Abstract: We consider a biological population in which a beneficial mutation
is undergoing a selective sweep when a second beneficial mutation arises at a
linked locus, and we investigate the probability that both mutations will
eventually fix in the population. Previous work has dealt with the case where
the second mutation to arise confers a smaller benefit than the first. In that
case population size plays almost no role. Here we consider the opposite case
and observe that, by contrast, the probability of both mutations fixing can be
heavily dependent on population size. Indeed the key parameter is $\rho N$, the
product of the population size and the recombination rate between the two
selected loci. If $\rho N$ is small, the probability that both mutations fix can
be reduced through interference to almost zero, while for large $\rho N$ the
mutations barely influence one another. The main rigorous result is a method
for calculating the fixation probability of a double mutant in the large
population limit.

1. Introduction

Natural populations incorporate beneficial mutations through a combination of 
chance and the action of natural selection. The process whereby a beneficial mu- 
tation arises (in what is generally assumed to be a large and otherwise neutral 
population) and eventually spreads to the entire population is called a selec- 
tive sweep. When beneficial mutations are rare, we can make the simplifying 
assumption that selective sweeps do not overlap. A great deal is known about 
such isolated selective sweeps (see e.g. Chapter 5 of Ewens 1979). Haldane (1927) 
showed that under a discrete generation haploid model, the probability that a 
beneficial allele with selective advantage $\sigma$ eventually fixes in a population of size
$2N$, i.e. its frequency increases from $1/(2N)$ to 1, is approximately $2\sigma$. Much
less is understood when selective sweeps overlap, i.e. when further beneficial 
mutations arise at different loci during the timecourse of a sweep. 

Our aim here is to investigate the impact of the resulting interference in the 
case when two sweeps overlap. In particular, we shall investigate the probability 
that both beneficial mutations eventually become fixed in the population. Because
genes are organised on chromosomes and chromosomes are in turn grouped
into individuals, different genetic loci do not evolve independently of one another.
However, in a diploid population (in which chromosomes are carried
in pairs), chromosomes are not passed down as intact units either. A given
chromosome is inherited from one of the two parents, but recombination or crossover
events can result in the allelic types at two distinct loci being inherited one from
each of the corresponding pair of chromosomes in the parent. We refer to these
chromosomes as 'individuals'.

Each individual in the population will have a type denoted $ij$, where $i, j \in
\{0,1\}$. We use the first and second digit, respectively, to indicate whether the
individual carries the more recent or the older beneficial mutation, and assume
that the fitness effects of these two mutations are additive. Suppose that a single
advantageous allele with selective advantage $\sigma_1$ arises in an otherwise neutral
(type 00) population of size $2N$, corresponding to a diploid population of size
$N$. Writing $X_{ij}$ for the proportion of individuals of type $ij$, the
frequency of the favoured allele, $X_{01}$, will be well-approximated by the solution
to the stochastic differential equation



$$dX_{01} = \sigma_1 X_{01}(1 - X_{01})\,ds + \sqrt{\tfrac{1}{N} X_{01}(1 - X_{01})}\,dW(s), \tag{1.1}$$

where $s$ is the time variable, $\{W(s)\}_{s \ge 0}$ is a standard Wiener process, and
$X_{01}(0) = 1/(2N)$ (Ethier & Kurtz 1986, Eq. 10.2.7). If the favoured allele
reaches frequency $p$, then the probability that it ultimately fixes is

$$\frac{1 - e^{-2N\sigma_1 p}}{1 - e^{-2N\sigma_1}}.$$

If a sweep does take place then (conditioning on fixation) we obtain 



$$dX_{01} = \sigma_1 X_{01}(1 - X_{01})\coth(N\sigma_1 X_{01})\,ds + \sqrt{\tfrac{1}{N} X_{01}(1 - X_{01})}\,dW(s)$$

and from this it is easy to calculate the expected duration of the sweep. Writing
$T_{fix} = \inf\{s \ge 0 : X_{01}(s) = 1 \mid X_{01}(0) = 1/(2N)\}$, we have (see for example
Etheridge et al. 2006)

$$E[T_{fix}] = \frac{2}{\sigma_1}\log(2N\sigma_1) + O\!\left(\frac{1}{\sigma_1}\right) \tag{1.2}$$

and the variance var[$T_{fix}$] is $O(1/\sigma_1^2)$. More generally, an analogous Green function
calculation to that leading to equation (1.2) gives that the expected time
for the selected locus to reach frequency $\epsilon(N)$ is $\log(2N\sigma_1\epsilon(N))/\sigma_1 + O(1/\sigma_1)$.
This is the same as the expected time for $X_{01}$ to increase from $1 - \epsilon(N)$ to 1.
On the other hand, for $\delta = O(1)$, the time for $X_{01}$ to increase from $\delta$ to $1 - \delta$
is $O(1/\sigma_1)$. As a result, for large populations, during almost all the timecourse
of the sweep $X_{01}$ is either close to zero or close to one.
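The claim that the sweep spends most of its time near 0 or near 1 can be illustrated with a deterministic caricature. The sketch below is not the diffusion analysis of the text: it assumes the noise-free logistic curve started from frequency 1/(2N), and the function names and parameter values are chosen here purely for illustration.

```python
import math

def logistic_frequency(t, two_n, sigma):
    """Deterministic logistic approximation started from frequency 1/(2N)."""
    return 1.0 / (1.0 + (two_n - 1) * math.exp(-sigma * t))

def time_to_reach(x, two_n, sigma):
    """Time at which the logistic curve above equals x (inverse of logistic_frequency)."""
    return math.log((two_n - 1) * x / (1.0 - x)) / sigma

# Illustrative values (assumptions, not taken from the paper).
two_n, sigma, delta = 10**6, 0.01, 0.05
t_low = time_to_reach(delta, two_n, sigma)            # 1/(2N) -> delta
t_high = time_to_reach(1 - delta, two_n, sigma)       # 1/(2N) -> 1 - delta
t_end = time_to_reach(1 - 1 / two_n, two_n, sigma)    # 1/(2N) -> 1 - 1/(2N)

# Fraction of the sweep spent at intermediate frequencies in [delta, 1 - delta].
middle_fraction = (t_high - t_low) / t_end
```

In this caricature the middle stretch has length (2/sigma) log((1 - delta)/delta), which does not grow with N, while the total duration grows like (2/sigma) log(2N); the fraction therefore shrinks as the population gets larger.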

Now suppose that during the selective sweep of type 01 described by (1.1), 
more specifically, when $X_{01}$ reaches a level $U$, another beneficial mutation with



imsart ver. 2006/03/07 file: cef8.tex date: November 29, 2008 






selection coefficient $\sigma_2$ occurs at a second linked locus in a randomly chosen
individual, and the recombination rate between these two loci is $\rho$. If we assume
that the arrival time of the second mutation is uniformly distributed over the
timecourse of the sweep of the first mutation and that $N$ is large, then we can
expect either $U$ or $1 - U$ to be close to 0 but $\gg 1/(2N)$. The new mutation
can arise in a type 00 or 01 individual, forming a single type 10 individual in
the former case, and a single type 11 individual in the latter case. If the second mutation
arises during the first half (in terms of time) of the sweep of the first mutation,
then $U$ is likely to be very small and it is more likely for a type 10 individual
to be formed. Otherwise, the second mutation arises during the second half of
the sweep and the formation of a type 11 individual is more likely.

The case of the second beneficial mutation forming a type 11 individual is 
relatively straightforward. Since type 11 is fitter than all other types, its fixation 
is almost certain once it becomes 'established' in the population, i.e. when the 
number of type 11 individuals is much larger than 1. If the population size is very 
large, then it only takes a short time to determine whether type 11 establishes 
itself, and we can assume the proportion of type 01 individuals remains roughly 
constant during this time. Hence the fixation probability of type 11 is essentially
its establishment probability, which is approximately $2(\sigma_2 + \sigma_1(1 - U))$, twice
the 'effective' selective advantage of type 11 in a population consisting of $2NU$
type 01 and $2N(1 - U)$ type 00 individuals.

The case of the second beneficial mutation forming a type 10 individual is far 
more interesting. In order for both mutations to sweep through the population, 
recombination must produce an individual carrying both mutations. The relative 
strength of selection acting on the two loci now becomes important. The case
of $\sigma_1 > \sigma_2$ has been dealt with in Barton (1995) and Otto & Barton (1997).
Here, since type 01 is already present in significant numbers when the new
mutation arises (and type 01 is fitter than type 10), the trajectory of $X_{01}$ is
well approximated by the logistic growth curve $1/(1 + \exp(-\sigma_1 t))$ until $X_{11}$
reaches a level of $O(1)$. At that point, fixation of type 11 is all but certain.
Barton (1995) then uses a branching process approximation to estimate the 
establishment probability of a type 11 individual produced by recombination. 
In particular, his approach is independent of population size. Not surprisingly, 
he finds that the fixation probability of the second mutation is reduced if it 
arises as a type 10 individual, but increased if it arises as a type 11 individual. 
Simulation studies performed in Otto & Barton (1997) confirm these findings
in the case $\sigma_1 > \sigma_2$.

Gillespie (2001) considers the effects of repeated substitutions at a strongly 
selected locus on a completely linked (i.e. there is no recombination) weakly
selected locus, extending his work in Gillespie (2000), where he considers a linked
neutral locus. He too sees little dependence of his results on population size, 
leading him to suggest repeated genetic hitchhiking events as an explanation 
for the apparent insensitivity of the genetic diversity of a population to its 
size. Kim (2006) extends the work of Gillespie (2001) by considering the effect 
of repeated sweeps on a tightly (but not completely) linked locus. This whole
body of work is concerned, in our terminology, with $\sigma_1 > \sigma_2$.

The case of $\sigma_2 > \sigma_1$ brings quite a different picture. The analysis used in
Barton (1995) breaks down for the following reason: because the second beneficial
mutation is more competitive than the first, type 10 is destined to start
a sweep itself if it gets established in the population. Once $X_{10}$ reaches $O(1)$,
$X_{01}$ is no longer well approximated by a logistic growth curve and in fact will
decrease to 0. The fixation probability of type 11 will then depend on the nonlinear
interaction of all four types, {11, 10, 01, 00}, and our analysis will show
that it is heavily dependent on population size. See Figure 1 below.

Fig 1. Simulation results for fixation probability of type 11 for the following initial condition:
the second mutation arises in a type 00 individual, when {2N)^''^ individuals in the population
have the first mutation (i.e. are of type 01). Vertical bars denote two standard deviations.
Parameter values: $\sigma_1 = 0.012$, $\sigma_2 = 0.02$, p = 4 X 10~^ (recombination coefficient).

This paper is organized as follows. In §2.1 we set up a continuous time Moran 
model for the evolution of our population. In the biological literature, it would be 
more usual to consider a Wright-Fisher model, in which the population evolves 
in discrete, non-overlapping generations. The choice of a Moran model, in which 
generations overlap, is a matter of mathematical convenience. One expects sim- 
ilar results for a Wright-Fisher model. The choice of a discrete individual based 
model rather than a diffusion is forced upon us by our method of proof, but is 
anyway natural in a setting where population size plays a role in the results. A 
brief analysis of our model, for very large $N$, leads to our main rigorous result,
Theorem 2.3, which provides a method to calculate the asymptotic ($N \to \infty$)
fixation probability of type 11 when $\sigma_2 > \sigma_1$. We discuss the case of moderate
N in §2.3. The rest of the paper is devoted to proofs, with §3 containing the 
proof of Theorem 2.3 and §4 containing the proof of Proposition 3.1. Results in 
§4 rely on supporting lemmas of §5.

2. Main Results

2.1. A Moran Model for Two Competing Selective Sweeps 

In this section we describe our model for the evolution of two competing selective 
sweeps. We use the notation from the introduction for the four possible types of 
individual in the population, $I = \{00, 10, 01, 11\}$, and assume that at the time
when the second mutation arises, the number $U \in \{0, 1, \ldots, 2N\}$ of type 01
individuals in the population is known. From now on we use $t = 0$ to denote the
time when the second mutation arises. As explained in §1, we may assume that
$U$ is much larger than 1.

Let $\sigma \in [0, 1]$ be the selective advantage of the second beneficial mutation and
$\sigma\gamma$ be the selective advantage of the first beneficial mutation (for some $\gamma > 0$).
The recombination rate between the two selected loci is denoted by $\rho$, which
we assume to be $o(1)$. We use $\{(\eta_n\zeta_n),\ n = 1, \ldots, 2N\}$ to denote the types of
individuals in the population. At time $t = 0$, we assume that the population of
$2N$ individuals consists of $2N - U - 1$ type 00 individuals, $U$ type 01 individuals
and 1 type 10 individual. The dynamics of the model are as follows:

1. Recombination: Each ordered pair of individuals, $(\eta_m\zeta_m)$ and $(\eta_n\zeta_n) \in I$,
is chosen at rate $\rho/(2N)$. With probability 1/2, $(\eta_m\zeta_n)$ replaces $(\eta_m\zeta_m)$.
Otherwise, $(\eta_n\zeta_m)$ replaces $(\eta_m\zeta_m)$.

2. Resampling (and selection): Each ordered pair of individuals, $(\eta_m\zeta_m)$ and
$(\eta_n\zeta_n) \in I$, is chosen at rate $1/(2N)$. With probability $p(\eta_m\zeta_m, \eta_n\zeta_n)$
given by

$$p(ij, kl) := \tfrac{1}{2}\bigl(1 + \sigma(i - k) + \sigma\gamma(j - l)\bigr),$$

a type $(\eta_m\zeta_m)$ individual replaces $(\eta_n\zeta_n)$. Otherwise a type $(\eta_n\zeta_n)$ individual
replaces $(\eta_m\zeta_m)$.

Remark 2.1. Evidently we must assume $\sigma(1 + \gamma) \le 1$ to ensure that all probabilities
used in the definition of the model are in [0, 1].
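The two mechanisms above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the authors' simulation code: it follows the embedded jump chain (event times are ignored, which does not change which type fixes), it draws ordered pairs of distinct individuals, and it reads the recombination step as "individual m is replaced by one of the two recombinants of m and n". The names `replace_prob` and `simulate_fixation` are hypothetical.

```python
import random

# A type is a pair (i, j): i flags the newer mutation (advantage sigma),
# j flags the older mutation (advantage sigma * gamma).
TYPES = [(0, 0), (1, 0), (0, 1), (1, 1)]

def replace_prob(a, b, sigma, gamma):
    """p(ij, kl) = (1 + sigma*(i - k) + sigma*gamma*(j - l)) / 2."""
    return 0.5 * (1.0 + sigma * (a[0] - b[0]) + sigma * gamma * (a[1] - b[1]))

def simulate_fixation(two_n, u, sigma, gamma, rho, rng):
    """Run the jump chain until one type fills the population; return that type."""
    pop = [(0, 0)] * (two_n - u - 1) + [(0, 1)] * u + [(1, 0)]
    while len(set(pop)) > 1:
        m, n = rng.sample(range(two_n), 2)      # ordered pair of distinct individuals
        if rng.random() < rho / (1.0 + rho):    # recombination vs resampling event
            if rng.random() < 0.5:
                pop[m] = (pop[m][0], pop[n][1])  # (eta_m, zeta_n) replaces individual m
            else:
                pop[m] = (pop[n][0], pop[m][1])  # (eta_n, zeta_m) replaces individual m
        elif rng.random() < replace_prob(pop[m], pop[n], sigma, gamma):
            pop[n] = pop[m]                      # m's type replaces n
        else:
            pop[m] = pop[n]                      # n's type replaces m
    return pop[0]
```

Averaging the indicator of `simulate_fixation(...) == (1, 1)` over many independent runs gives a Monte Carlo estimate of the fixation probability of type 11; for population sizes like those used in the figures this naive loop would be slow and serves only to make the model concrete.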

Remark 2.2. If $\rho$ and $\sigma$ are small, then decoupling recombination from the rest
of the reproduction process does not affect the behaviour of the model a great deal
and it will simplify analysis.

Let $\mathbb{P}$ denote the law of this Moran particle system, and let $r^+_{ij}$ and $r^-_{ij}$ be the
rates at which $X_{ij}$ increases and decreases by $1/(2N)$, respectively. Then

$$
\begin{aligned}
r^+_{10} &= N X_{10}[(1+\sigma)(1-X_{10}) - \sigma(1+\gamma)X_{11} - \sigma\gamma X_{01}] + \rho N(2X_{11}X_{00} + X_{10}X_{11} + X_{10}X_{00}) \\
r^-_{10} &= N X_{10}[(1-\sigma)(1-X_{10}) + \sigma(1+\gamma)X_{11} + \sigma\gamma X_{01}] + \rho N X_{10}(X_{00} + 2X_{01} + X_{11}) \\
r^+_{01} &= N X_{01}[(1+\sigma\gamma)(1-X_{01}) - \sigma(1+\gamma)X_{11} - \sigma X_{10}] + \rho N(X_{00}X_{01} + X_{11}X_{01} + 2X_{11}X_{00}) \\
r^-_{01} &= N X_{01}[(1-\sigma\gamma)(1-X_{01}) + \sigma(1+\gamma)X_{11} + \sigma X_{10}] + \rho N X_{01}(X_{00} + 2X_{10} + X_{11}) \\
r^+_{11} &= N X_{11}[(1+\sigma(1+\gamma))(1-X_{11}) - \sigma X_{10} - \sigma\gamma X_{01}] + \rho N(2X_{10}X_{01} + X_{10}X_{11} + X_{01}X_{11}) \\
r^-_{11} &= N X_{11}[(1-\sigma(1+\gamma))(1-X_{11}) + \sigma X_{10} + \sigma\gamma X_{01}] + \rho N X_{11}(2X_{00} + X_{01} + X_{10}) \\
r^+_{00} &= N X_{00}[1 - X_{00} - \sigma(1+\gamma)X_{11} - \sigma X_{10} - \sigma\gamma X_{01}] + \rho N(X_{01}X_{00} + X_{00}X_{10} + 2X_{01}X_{10}) \\
r^-_{00} &= N X_{00}[1 - X_{00} + \sigma(1+\gamma)X_{11} + \sigma X_{10} + \sigma\gamma X_{01}] + \rho N X_{00}(X_{01} + 2X_{11} + X_{10}).
\end{aligned} \tag{2.1}
$$

2.2. Analysis and Results for Large N

We are concerned primarily with the case of very large population sizes, which
is the regime where our main rigorous result, Theorem 2.3, operates. A non-rigorous
analysis for moderate population sizes based on very similar ideas is
also possible but will appear in Yu & Etheridge (2008).

To motivate our result, we present a heuristic analysis of the possible scenarios.
The proof of our main result fills in the necessary steps to make this
rigorous. If the second beneficial mutation gives rise to a single type 10 individual,
then the process whereby type 11 becomes fixed must proceed in three
stages and our approach is to estimate the probability of each of these hurdles
being overcome. First, following the appearance of the new mutant, $X_{10}$ must
'become established', by which we mean achieve appreciable frequency in the
population. Without this, there will be no chance of step two: recombination of
a type 01 and a type 10 individual to produce a type 11. Finally, type 11 must
become established (after which its ultimate fixation is essentially certain). Of
course this may not happen the first time a new recombinant is produced. If
type 11 becomes extinct and neither $X_{01}$ nor $X_{10}$ is one, then we can go back
to step two.

We assume the first mutation has been undergoing a selective sweep prior to
the arrival of the second mutation. Before the arrival of the second beneficial
mutation (during which $X_{10}$ and $X_{11}$ are both 0), we can write

$$X_{01}(s) = X_{01}(0) + M_{01}(s) + \sigma\gamma \int_0^s X_{01}(u)(1 - X_{01}(u))\,du,$$

where $M_{01}$ is a martingale with maximum jump size $1/(2N)$ and quadratic
variation $\langle M_{01}\rangle(s) = \frac{1+\rho}{2N}\int_0^s X_{01}(u)(1 - X_{01}(u))\,du$, i.e. $\langle M_{01}\rangle$ is the unique
previsible process such that $M_{01}(s)^2 - M_{01}(0)^2 - \langle M_{01}\rangle(s)$ is a martingale. See
e.g. §II.3.9 of Ikeda & Watanabe (1981). We drop the martingale term $M_{01}$ and
approximate the trajectory of $X_{01}$ using a logistic growth curve, i.e. $X_{01}(s) \approx
1/(1 + (2N - 1)\exp(-\sigma\gamma s))$, which solves $\frac{dX_{01}}{ds} = \sigma\gamma X_{01}(s)(1 - X_{01}(s))$ and
$X_{01}(0) = 1/(2N)$. As discussed in §1, if we assume that the arrival time of the
second mutation is uniformly distributed on the timecourse of the sweep of the
first and $N$ is large, then $X_{01}$ spends most of the time near 0 or near 1.
We divide into two cases.

1. The second mutation arises during the first half of the sweep of the first
mutation, i.e. when $X_{01} < 1/2$.

2. The second mutation arises during the second half of the sweep of the first
mutation, i.e. when $X_{01} > 1/2$.

In Case 2, $X_{01}$ is close to 1 and it is most likely that the second mutation
arises in a type 01 individual to form a single type 11 individual, in which case
the fixation probability is roughly the same as the establishment probability of
type 11 arising in a population consisting entirely of type 01 individuals, which
in turn is roughly $2\sigma/(1 + \sigma)$.

From now on, we focus on the more interesting Case 1. In what follows,
$t = 0$ will be the time of arrival of the second beneficial mutation. In Case 1 it is
most likely that the second mutation arises in a type 00 individual resulting
in a single type 10 individual in the population. If we approximate the growth
of $X_{01}$ by a logistic growth curve, then it reaches 1/2 at time
$\frac{1}{\sigma\gamma}\log(2N - 1) \approx \frac{1}{\sigma\gamma}\log(2N)$. Choosing the time of the introduction of the new mutation
uniformly on $[0, \frac{1}{\sigma\gamma}\log(2N)]$, we see that at $t = 0$, $X_{01} \approx (2N)^{-\zeta}$, where
$\zeta \sim$ Unif[0,1].
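The change of variables from arrival time to the exponent zeta can be checked numerically. The following is a minimal sketch assuming the logistic approximation for X_01 described above; `zeta_at_arrival` and `sample_zetas` are hypothetical helper names and the parameter values used with them are up to the reader.

```python
import math
import random

def zeta_at_arrival(s, two_n, sigma_gamma):
    """Exponent zeta such that X_01(s) = (2N)^(-zeta) under the logistic approximation."""
    x01 = 1.0 / (1.0 + (two_n - 1) * math.exp(-sigma_gamma * s))
    return -math.log(x01) / math.log(two_n)

def sample_zetas(n_samples, two_n, sigma_gamma, rng):
    """Arrival times uniform on [0, log(2N)/(sigma*gamma)], mapped to zeta."""
    t_half = math.log(two_n) / sigma_gamma
    return [zeta_at_arrival(rng.uniform(0.0, t_half), two_n, sigma_gamma)
            for _ in range(n_samples)]
```

For large 2N a histogram of the sampled values is close to flat on [0, 1]; the small deviation near zeta = 0 comes from the logistic curve bending as X_01 approaches 1/2.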

The establishment probability for type 10 in this case is relatively easy to
estimate. Since $\sigma_2 > \sigma_1$, type 10 either dies out or becomes established before
$X_{01}$ can grow to be a significant proportion of the population. Therefore the
establishment probability of type 10 is almost the same as that of a type 10 arising in
a population consisting entirely of type 00 individuals, roughly $2\sigma/(1 + \sigma)$.
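The value 2σ/(1 + σ) is the survival probability of a birth and death process whose birth and death rates are in the ratio (1 + σ) : (1 − σ). As a hedged illustration (the function names below are ours), it can be compared with the classical gambler's-ruin probability that the embedded random walk, started from one individual, reaches a high level before hitting 0.

```python
def establishment_probability(sigma):
    """Survival probability 1 - d/b for birth rate b ~ (1 + sigma), death rate d ~ (1 - sigma)."""
    return 2.0 * sigma / (1.0 + sigma)

def reach_level_probability(sigma, level):
    """Gambler's-ruin probability that the embedded walk started at 1 hits `level` before 0."""
    r = (1.0 - sigma) / (1.0 + sigma)   # ratio of death rate to birth rate
    return (1.0 - r) / (1.0 - r ** level)
```

As `level` grows, `reach_level_probability` decreases to `establishment_probability`, which is one way to make precise the idea that reaching a moderately large number of copies is essentially the same as becoming established.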

We observe that if type 11 does get established, then since it has a fitness advantage
over all other types, the probability that it eventually fixes is very close
to 1 (this follows from Lemma 3.2). Therefore we can concentrate on the behaviour
of $X$ before $X_{11}$ reaches, say, $(\log(2N))/(2N)$, which is still very small compared
to 1. After type 10 is established and prior to type 11 being established, we
approximate $X_{10}$ and $X_{01}$ deterministically. Until either $X_{10}$ or $X_{01}$ is $O(1)$,
both grow roughly exponentially, so assuming that type 10 gets established, we
have

$$X_{10}(t) \approx \frac{1}{2N} e^{\sigma t}, \qquad X_{01}(t) \approx \frac{1}{(2N)^{\zeta}} e^{\sigma\gamma t}. \tag{2.2}$$

We divide Case 1 further into two sub-cases. See Figure 2 for an illustration.

Case 1a, $\zeta < \gamma$. The approximation (2.2) fails once either $X_{10}$ or $X_{01}$ reaches
$O(1)$, which occurs at time $\frac{1}{\sigma}\log(2N) \wedge \frac{\zeta}{\sigma\gamma}\log(2N)$. If $\zeta < \gamma$, then $X_{01}$ reaches
$O(1)$ before $X_{10}$, and will further increase to almost 1 (which takes time only
$O(1)$) before $X_{10}$ reaches $O(1)$. At this time, which we denote $T_1$, the population
consists almost entirely of types 01 or 10. Type 10, already established but still
just a small proportion of the population, will then proceed to grow logistically,
displacing type 01 individuals until $X_{10}$ is close to 1 at time $T_2$. During $[T_1, T_2]$
(of length $O(1)$), both $X_{01}$ and $X_{10}$ are $O(1)$, so we expect $O(\rho N)$ recombination
events between them producing $O(\rho N)$ type 11 individuals. Each type 11
individual has a probability of at least $2\sigma\gamma/(1 + \sigma\gamma)$ of eventually becoming the
common ancestor of all individuals in the population. So if we want to get a nontrivial
limit (as $N \to \infty$) for the fixation probability of type 11, we should take
$\rho = O(1/N)$. When we use the term nontrivial here, we mean that as $N \to \infty$,
(i) the fixation probability does not tend to 0, due to a lack of recombination
events between type 10 and type 01 individuals, and (ii) nor does it tend to the
establishment probability of type 10, due to infinitely many type 11 births, one
of which is bound to sweep to fixation.

(a) Case 1a: $\zeta = 0.3$, $\gamma = 0.6$ (b) Case 1b: $\zeta = 0.7$, $\gamma = 0.6$

Fig 2. Approximate trajectories of $X_{01}$ (solid line) and $X_{10}$ (dashed line) when $X_{11}$ is small:
these curves are obtained assuming they undergo deterministic logistic growth with initial
condition $X_{10}(0) = (2N)^{-1}$ and $X_{01}(0) = (2N)^{-\zeta}$. Parameter values: $\sigma = 0.02$, (2N) = 10*.
In Case 1a, $X_{01}$ reaches almost 1 before being displaced by $X_{10}$, but in Case 1b, $X_{01}$ never
reaches $O(1)$.

Case 1b, $\zeta > \gamma$. In this case, $X_{10}$ reaches $O(1)$ at time roughly $\frac{1}{\sigma}\log(2N)$,
before $X_{01}$ does, and $X_{01}$ is $O((2N)^{\gamma-\zeta})$ at this time. Furthermore, the biggest
$X_{01}$ can get is $O((2N)^{\gamma-\zeta})$ since $X_{10}$ will very soon afterwards increase to almost
1, after which $X_{01}$ will exponentially decrease (since type 01 is less fit than type
10). Hence we expect $O(\rho N^{1+\gamma-\zeta})$ recombination events between type 10 and
type 01, and the 'correct' scaling for $\rho$ is $\rho = O(N^{\zeta-\gamma-1})$ in this case.
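The case distinction and the resulting scaling of the recombination rate can be summarised in a few lines. This is a sketch of the heuristic only, under the exponential approximations (2.2) read as X_10(t) ≈ e^{σt}/(2N) and X_01(t) ≈ (2N)^{-ζ} e^{σγt}; the function names are hypothetical and the boundary ζ = γ, which the text does not discuss, is arbitrarily assigned to Case 1b.

```python
import math

def time_to_order_one(two_n, sigma, gamma, zeta):
    """Times at which the exponential approximations for X_10 and X_01 reach 1."""
    log2n = math.log(two_n)
    return log2n / sigma, zeta * log2n / (sigma * gamma)

def classify_case(gamma, zeta):
    """Case 1a if X_01 reaches O(1) first (zeta < gamma), otherwise Case 1b."""
    return "1a" if zeta < gamma else "1b"

def rho_scaling_exponent(gamma, zeta):
    """Exponent a such that rho = O(N^a) gives O(1) recombination events in the heuristic."""
    return -1.0 if zeta < gamma else zeta - gamma - 1.0
```

With the values used in Figure 2, `classify_case(0.6, 0.3)` gives Case 1a and `classify_case(0.6, 0.7)` gives Case 1b, where the heuristic scaling exponent is 0.7 − 0.6 − 1 = −0.9.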

In Case 1a, if we take $\rho = O(1/N)$, then most of the recombination events
between type 10 and type 01 individuals occur when type 10 is logistically
displacing type 01, i.e. in the time interval $[T_1, T_2]$. During this time, we can
approximate $X_{10}$ and $X_{01}$ by $Z_{10}$ and $1 - Z_{10}$, respectively, where $Z_{10}$ is deterministic
and obeys the logistic growth equation with parameter $\sigma(1 - \gamma)$,
twice the advantage of type 10 over type 01. We can further approximate $X_{11}$
by a birth and death process $Z_{11}$ with deterministic but time-varying rates that
depend on $Z_{10}$. Specifically, the rates of increase and decrease for $Z_{11}$ are the
same as $r^\pm_{11}$ in (2.1), but with $X_{10}$ replaced by $Z_{10}$, $X_{01}$ replaced by $1 - Z_{10}$ and
$X_{00}$ replaced by 0.

The probability that $X_{11}$ gets established, i.e. reaches

$$\delta_{11} = \lceil \log(2N) \rceil/(2N),$$

is then approximated by the probability that the birth and death process $Z_{11}$
reaches $\delta_{11}$. The latter can be found by solving the forward equation for the
process $Z_{11}$, which can be found in (3.3). We define the fixation time of the
Moran particle system of §2.1:

$$T_{fix} = \inf\{t \ge 0 : X_{ij}(t) = 1 \text{ for some } ij \in I\}.$$

We observe that the Markov chain $(X_{00}, X_{01}, X_{10})$ has finitely many states and
the recurrent states are $R = \{(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)\}$. Every other
state is transient and there is positive probability of reaching $R$ starting from
any transient state in finite time. Therefore

$$T_{fix} < \infty \quad \text{a.s.}$$

Our main result, Theorem 2.3 below, concerns Case 1a, which is the most likely
scenario if $\gamma$ is close to 1.

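Theorem 2.3. If $\zeta < \gamma < 1$ and $\rho = O(1/N)$, then there exists $\delta > 0$, whose
value depends on $\rho$, $\sigma$, $\gamma$, and $\zeta$, such that

$$\left| \mathbb{P}(X_{11}(T_{fix}) = 1) - \frac{2\sigma}{1+\sigma}\, p^{(11)}_{\delta_{11}}(T_\infty) \right| \le N^{-\delta}$$

for sufficiently large $N$, where $p^{(11)}_k(t)$ solves the forward equation (3.3).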

In the above, $2\sigma/(1+\sigma)$ corresponds to the establishment probability of type 10,
while $p^{(11)}_{\delta_{11}}(T_\infty)$ approximates the establishment probability of type 11 conditioning
on type 10 becoming established. Figure 3 compares fixation probabilities
obtained from simulation, our non-rigorous calculation (which we briefly
discuss in §2.3 below), and the large population limit of Theorem 2.3. In Figure
3(a) we hold $\rho N$ constant in this simulation, and observe that the fixation
probability of type 11 increases but does not change drastically as $N$ becomes
large. The reason for the drop in the fixation probability of type 11 when $N$
is small may be that in this case the early phase for $X_{01}$ is very short, so that
$X_{01}$ grows quickly and reduces the establishment probability of type 10. In
Figure 3(a), we use a population size of $2N = 50,000$ to approach the large
population limit of Theorem 2.3. At $2N = 50,000$, it takes roughly 12 hours on
a PC to obtain one data point in Figure 3, which is run with 20,000 realisations.
Apparently this population size still results in underestimates of the
large population limit.




Fig 3. Fixation probability of type 11: circles denote data points from simulations with vertical
bars denoting one standard deviation. (a) Varying population size: the solid line denotes probabilities
obtained using our non-rigorous calculation, and the dashed line denotes the large
population limit of Theorem 2.3, with $\rho(2N) = 0.2$. (b) Varying $\rho(2N)$: the solid line plots the
large population limit of Theorem 2.3, and the simulation uses population size $2N = 50,000$.
Other parameter values: $\sigma = 0.02$, $\zeta = 0.3$ and $\gamma = 0.6$.



We expect a similar result for Case 1b, for which we provide an outline here.
We take $\epsilon < (\zeta - \gamma)/(2 + \gamma)$ and $t_1 = \frac{1-\epsilon}{\sigma}\log(2N)$; then at time $t_1$, we expect
$X_{10}$ to be either 0 (with approximately the same probability as in Case 1a) or
$O((2N)^{-\epsilon})$, and $X_{01}$ to be roughly $(2N)^{(1-\epsilon)\gamma-\zeta} < (2N)^{-2\epsilon}$. Since $X_{01}$ and $X_{11}$
can be expected to be quite small before $t_1$, they exert little influence on the
trajectory of $X_{10}$, which jumps by $\pm 1/(2N)$ at roughly the following rates:

$$r^+_{10} \approx N(1 + \sigma + \rho)X_{10}, \qquad r^-_{10} \approx N(1 - \sigma + \rho)X_{10}.$$

Hence before $t_1$, $2NX_{10}$ resembles a continuous-time branching process $Z$ with
generating function of offspring distribution in the form of
$u(s) = \frac{1}{2}(1 + \sigma + \rho)s^2 + \frac{1}{2}(1 - \sigma + \rho) - (1 + \rho)s$.
Using Theorem III.8.3 of Athreya & Ney (1972), we can
calculate $E[e^{-uW}]$ for $W = \lim_{t \to \infty} e^{-\sigma t}Z(t)$ and conclude that $W$ is distributed
according to $\frac{1-\sigma+\rho}{1+\sigma+\rho}\,\delta_0(x) + \left(\frac{2\sigma}{1+\sigma+\rho}\right)^2 \exp\left(-\frac{2\sigma}{1+\sigma+\rho}\,x\right) dx$ for $x \ge 0$. Hence the conditional
distribution of $X_{10}(t_1) \mid X_{10}(t_1) > 0$ resembles Exp$\left(\frac{1+\sigma+\rho}{2\sigma}(2N)^{-\epsilon}\right)$, an
exponential distribution with mean $\frac{1+\sigma+\rho}{2\sigma}(2N)^{-\epsilon}$, as $N \to \infty$.

From time $t_1$ onwards, until either $X_{10}$ gets very close to 1 or $X_{01}$ becomes
much smaller than $O((2N)^{(1-\epsilon)\gamma-\zeta})$, we can assume that the paths of $X_{01}$ and
$X_{10}$ resemble those of $Z_{01}$ and $Z_{10}$, respectively, where

$$
\begin{aligned}
dZ_{10} &= Z_{10}[\sigma(1 - Z_{10}) - \sigma\gamma Z_{01}]\,dt \\
dZ_{01} &= Z_{01}[\sigma\gamma(1 - Z_{01}) - \sigma Z_{10}]\,dt
\end{aligned}
$$

with the initial condition $Z_{10}(t_1)$ drawn according to Exp$\left(\frac{1+\sigma+\rho}{2\sigma}(2N)^{-\epsilon}\right)$ and
$Z_{01}(t_1) = (2N)^{(1-\epsilon)\gamma-\zeta}$. As in Case 1a, we can then approximate $X_{11}$ by a
birth and death process $Z_{11}$ with rates the same as $r^\pm_{11}$ from (2.1) but with
$X_{10}$ replaced by $Z_{10}$ and $X_{01}$ replaced by $Z_{01}$. The probability that $Z_{11}$ reaches
$\delta_{11}$ can then be found by solving the forward equation for $Z_{11}$. Finally, we
integrate this probability against all initial conditions for $Z_{10}$, drawn according
to Exp$\left(\frac{1+\sigma+\rho}{2\sigma}(2N)^{-\epsilon}\right)$. The proof of such a result is more tedious than that of
Theorem 2.3 but makes use of similar ideas.

2.3. Brief Comment on Moderate N 

For moderate population sizes, the observation in Case 1a of §2.2 that $X_{01}$
increases to close to 1 before $X_{10}$ reaches $O(1)$ breaks down. We can, however,
compute the distribution function $f_T$ of the random time $T_{10;\delta_{10}}$ when $X_{10}$ hits
a certain level $\delta_{10}$, assuming that $X_{01}$ grows logistically before $T_{10;\delta_{10}}$. From
$T_{10;\delta_{10}}$ onwards and before $X_{11}$ hits $\delta_{11}$, $X_{10}$ grows roughly deterministically,
displacing both type 01 and type 00, so we can approximate $X_{11}$ by $Z_{11}$, a birth
and death process with time-varying jump rates in the form of $r^\pm_{11}$ in (2.1),
but with $X_{10}$, $X_{01}$ and $X_{00}$ replaced by their deterministic approximations.
Assuming $T_{10;\delta_{10}} = t$, we can numerically solve the forward equation for $Z_{11}$,
which is directly analogous to (3.3), to find the probability that $Z_{11}$ eventually
hits $\delta_{11}$, which we denote by $p_{est}(t)$. The dependence of $p_{est}$ on $t$ comes through
the initial condition $X_{01}$ for the ODE system, which depends on $T_{10;\delta_{10}}$. The
fixation probability of type 11 is then approximately $\int p_{est}(t) f_T(t)\,dt$. This is
the algorithm we use to produce the solid line in Figure 3(a) and is given in
full detail in Yu & Etheridge (2008).
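The last step of this procedure, integrating the establishment probability against the law of the hitting time, can be sketched as a simple quadrature. The code below is an illustration only and is not the algorithm of Yu & Etheridge (2008): it assumes that $f_T$ is available as a density on a bounded interval, that $p_{est}$ can be evaluated pointwise, and that a trapezoidal rule is accurate enough; the function name and arguments are hypothetical.

```python
def fixation_probability(p_est, f_t, t_min, t_max, n_steps=1000):
    """Trapezoidal approximation of the integral of p_est(t) * f_t(t) over [t_min, t_max]."""
    h = (t_max - t_min) / n_steps
    total = 0.5 * (p_est(t_min) * f_t(t_min) + p_est(t_max) * f_t(t_max))
    for i in range(1, n_steps):
        t = t_min + i * h
        total += p_est(t) * f_t(t)
    return total * h
```

As a sanity check, a constant establishment probability integrated against any density supported on the interval should return that constant, for example `fixation_probability(lambda t: 0.3, lambda t: 0.1, 0.0, 10.0)`.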

3. Proof of the Main Theorem 

We first define some of the functions, events, and stochastic processes needed
for the proof, then give some intuition, before we proceed with the proof of
Theorem 2.3. We begin by describing a deterministic process $Y_{10}$ and a birth
and death process $Y_{11}(t)$ which, up to a shift by a random time, are $Z_{10}$ and
$Z_{11}$ described in §2.2, respectively. They approximate the trajectories of $X_{10}$
and $X_{11}$, respectively, after the establishment of type 10. To describe the
(time-inhomogeneous) rates we need the solution

$$L(t; y_0, \theta) = \frac{y_0 e^{\theta t}}{1 + y_0(e^{\theta t} - 1)} \tag{3.1}$$

to the logistic growth equation $L(t; y_0, \theta) = y_0 + \theta \int_0^t L(s; y_0, \theta)(1 - L(s; y_0, \theta))\,ds$.
In what follows, $a_0 = \zeta/(3\gamma)$ is a constant, $c_1, c_2, c_3$ are constants (slightly
smaller than $O(1)$) that we specify precisely in Proposition 3.1, and
, °0 1.-^9 An , i.0ilog(2jV) 

to = —l0g[2N ), tearly = —r^ ^ , (3.2) 

1 , 1-ci 1.02, 

tmid = —r. 7 log ,tiate = log(2A^). 

— 7j ci aj 

These deterministic times roughly correspond to the lengths of the 'stochastic',
'early' (an upper bound), 'middle', and 'late' phases of $X_{01}$, whose role is described
in more detail in §4. During the time interval when $Y_{10}$ is between $c_1$
and $1 - c_1$, whose length is exactly $t_{mid}$, there are birth events of $Z_{11}$ corresponding
roughly to recombination events between type 10 and 01 individuals.
For $t \in [0, t_{mid})$, we define

$$
\begin{aligned}
Y_{10}(t) &= L(t; c_1, \sigma(1-\gamma)) \\
\beta^+(z,t) &= Nz[(1+\sigma(1+\gamma))(1-z) - (\sigma-\rho)Y_{10}(t) - (\sigma\gamma-\rho)(1-Y_{10}(t))] + 2\rho N Y_{10}(t)(1-Y_{10}(t)) \\
\beta^-(z,t) &= Nz[(1-\sigma(1+\gamma)+2\rho)(1-z) + (\sigma-\rho)Y_{10}(t) + (\sigma\gamma-\rho)(1-Y_{10}(t))],
\end{aligned}
$$

and for $t \ge t_{mid}$, we define

$$
\begin{aligned}
Y_{10}(t) &= 1 \\
\beta^+(z,t) &= N(1+\sigma\gamma+\rho)z(1-z) \\
\beta^-(z,t) &= N(1-\sigma\gamma+\rho)z(1-z).
\end{aligned}
$$

We then take $Y_{11}$ to be a birth and death process with birth and death rates
$\beta^+(Y_{11},t)$ and $\beta^-(Y_{11},t)$, respectively (i.e. $Y_{11}$ jumps by $\pm 1/(2N)$ at rates
$\beta^+(Y_{11},t)$ and $\beta^-(Y_{11},t)$, respectively), and initial condition $Y_{11}(0) = 0$. It
is absorbed on hitting $\delta_{11}$.

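It is convenient to write $k^- = k - 1/(2N)$ and $k^+ = k + 1/(2N)$. $Y_{11}$ is run
until time $t_{mid} + t_{late}$. The probability that $Y_{11}$ hits $\delta_{11}$ before then can be found
by solving a system of ODE's. Let $p^{(11)}_k$ satisfy

$$\frac{d}{dt}p^{(11)}_k(t) = \beta^+(k^-,t)\,p^{(11)}_{k^-}(t) + \beta^-(k^+,t)\,p^{(11)}_{k^+}(t) - (\beta^+(k,t) + \beta^-(k,t))\,p^{(11)}_k(t)$$

for $k = 1/(2N), \ldots, \delta_{11,-}$, where $\delta_{11,-} = \delta_{11} - 1/(2N)$, and

$$
\begin{aligned}
\frac{d}{dt}p^{(11)}_0(t) &= \beta^-(1/(2N),t)\,p^{(11)}_{1/(2N)}(t) - \beta^+(0,t)\,p^{(11)}_0(t) \\
\frac{d}{dt}p^{(11)}_{\delta_{11}}(t) &= \beta^+(\delta_{11,-},t)\,p^{(11)}_{\delta_{11,-}}(t) - \beta^-(\delta_{11},t)\,p^{(11)}_{\delta_{11}}(t)
\end{aligned} \tag{3.3}
$$

with initial condition $p^{(11)}_k(0) = 1_{\{k=0\}}$. Then

$$\mathbb{P}(Y_{11} \text{ hits } \delta_{11} \text{ before } t_{mid} + t_{late}) = p^{(11)}_{\delta_{11}}(t_{mid} + t_{late}). \tag{3.4}$$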
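A numerical solution of the forward equations can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses an explicit Euler scheme, indexes states by the integer $2Nk$, treats the level $\delta_{11} = \lceil\log(2N)\rceil/(2N)$ as absorbing (so no mass flows back down from it), and switches from the 'middle' to the 'late' rates when the logistic curve $Y_{10}$ reaches $1 - c_1$ rather than at a precomputed $t_{mid}$. All function names and the parameter values in the usage note are hypothetical.

```python
import math

def logistic(t, y0, theta):
    """L(t; y0, theta): solution of dL/dt = theta * L * (1 - L) with L(0) = y0."""
    return y0 / (y0 + (1.0 - y0) * math.exp(-theta * t))

def birth_death_rates(z, y10, n, sigma, gamma, rho, c1):
    """Rates (beta_plus, beta_minus) at frequency z when Y_10 currently equals y10."""
    if y10 < 1.0 - c1:  # 'middle' phase: types 10 and 01 coexist and recombine
        drag = (sigma - rho) * y10 + (sigma * gamma - rho) * (1.0 - y10)
        up = n * z * ((1.0 + sigma * (1.0 + gamma)) * (1.0 - z) - drag) \
            + 2.0 * rho * n * y10 * (1.0 - y10)
        down = n * z * ((1.0 - sigma * (1.0 + gamma) + 2.0 * rho) * (1.0 - z) + drag)
    else:               # 'late' phase: Y_10 is set to 1, no further recombinants
        up = n * (1.0 + sigma * gamma + rho) * z * (1.0 - z)
        down = n * (1.0 - sigma * gamma + rho) * z * (1.0 - z)
    return up, down

def hitting_probability(two_n, sigma, gamma, rho, c1, t_end, dt=0.01):
    """Euler solution of the forward equations; returns (P(hit delta_11 by t_end), p)."""
    n = two_n / 2.0
    k_max = math.ceil(math.log(two_n))   # delta_11 = k_max / (2N)
    theta = sigma * (1.0 - gamma)
    p = [1.0] + [0.0] * k_max            # all mass starts at k = 0
    for step in range(int(round(t_end / dt))):
        y10 = logistic(step * dt, c1, theta)
        flow = [0.0] * (k_max + 1)
        for k in range(k_max):           # k_max is treated as absorbing (an assumption)
            up, down = birth_death_rates(k / two_n, y10, n, sigma, gamma, rho, c1)
            flow[k] -= (up + down) * p[k]
            flow[k + 1] += up * p[k]
            if k > 0:
                flow[k - 1] += down * p[k]
        p = [pk + dt * fk for pk, fk in zip(p, flow)]
    return p[k_max], p
```

For example, `hitting_probability(10**4, 0.02, 0.6, 0.2 / 10**4, 0.01, t_end=100.0)` returns the mass absorbed at the top level by time 100 together with the full distribution. The step `dt` should be small compared with the inverse of the largest jump rate, which is of order $\log(2N)$ here.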
We use the following convention for stopping times:

T^-j.,^ = 'mi{t > : > x}, Tz-x = inf{t >Q: Z>x} (3.5) 
SY.z.d^ff = inf{i > : ^ 

for any $ij \in \{00, 01, 10, 11\}$ and processes $Y$ and $Z$, and define the stopping time

$S_{10,01,rec} = \inf\{t \ge 0$ : there is a recombination event
between a type 10 and a type 01 individual before time $t\}$.

We define events 

$E_1 = \{X_{10}(t_0) > 0\}$

$E_2 = \{T_{10;c_1} \le T_{11;1/(2N)} \wedge (t_0 + t_{early})\} \cap \{X_{01}(T_{10;c_1}) \ge 1 - c_1 - c_2\}$

£^3 = {|^io(t) - ^10 (01 < C3 and Xooit) < for aU 

Aril;5ii]} 

^5 = {Xn(t) + Xio(t) > 1 - for aU t > Tz,,a^c,} 
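$E_6 = \{X_{11}(T_\infty) + X_{10}(T_\infty) = 1\}$

$E_7 = \{X_{11}(t) = Z_{11}(t) \text{ for all } t \in [T_{10;c_1}, T_\infty \wedge T_{11;\delta_{11}}]\}$
$E_8 = \{T_{11;\delta_{11}} \le T_\infty \text{ or } X_{11}(T_\infty) = Z_{11}(T_\infty) = 0\}.$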

We observe that T'ii;i/(27v) ^ •S'lo.oi.rec- First we outline the intuition behind 
these definitions: tg is the length of the initial 'stochastic' phase for Xxq. At 
to, with high probabihty Xiq either is 0((2iV)""-i) or has hit (event E^). 
In the latter case, there is no need to approximate Xi^ any further. On the 
other hand, if Ei occurs, then type 10 is very likely to be established by to and, 
with high probability, grows almost deterministically to reach level ci (slightly 
smaller than 0(1)) at time Tiq-,,^. Furthermore, as discussed in §1, in Case la, 
since C < 7, with high probability Xoi(rio:ci) is close to 1. Hence conditional 
on El , the event E2 is very likely. 
For paths in $E_2 \cap E_1$, we define

$Z_{10}(T_{10;c_1} + t) = Y_{10}(t), \quad Z_{11}(T_{10;c_1} + t) = Y_{11}(t)$ (3.6)

to be the approximations for the trajectories of $X_{10}$ and $X_{11}$, respectively, from
time $T_{10;c_1}$ onwards. For convenience, we define $Z_{10}(t) = Z_{11}(t) = 0$ for $t < T_{10;c_1}$.
With the convention of (3.5),

$T_{Z_{10};1-c_1} = T_{10;c_1} + t_{\mathrm{mid}}$

and we observe that $Z_{10}(t) = 1$ for $t \ge T_{Z_{10};1-c_1}$. Since $X_{01}(T_{10;c_1}) \approx 1$,
$X_{00}(T_{10;c_1})$ is very small and is unlikely to recover because type 00 is the least
fit type. During $[T_{10;c_1}, T_{Z_{10};1-c_1}]$, with high probability, type 10 grows logistically
at rate $\sigma(1-\gamma)$, displacing type 01. Hence conditional on $E_1 \cap E_2$, $E_3$
is very likely. During $[T_{10;c_1}, T_{Z_{10};1-c_1}]$, the definition of $Z_{11}$ takes into account
recombination events between type 01 and 10 individuals that produce type 11
individuals at a rate of $\rho(2N)X_{01}X_{10}$, which, in the definition of $Z_{11}$, is approximated
by $\rho(2N)Z_{10}(1 - Z_{10})$. Notice that we can approximate $X_{01}$ by $1 - Z_{10}$
since we assume throughout that $X_{11} < \delta_{11}$, which is very small. Outside the
time interval $[T_{10;c_1}, T_{Z_{10};1-c_1}]$, either $X_{10}$ is very small or very close to 1 (which
means $X_{01}$ is very small), hence we ignore any recombination events. Because
$Z_{11}$ closely approximates $X_{11}$, conditional on $E_3 \cap E_2 \cap E_1$ the event $E_7$ has a high
probability.
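The dependence on $\rho N$ in this heuristic can be illustrated numerically. The sketch below is not part of the proof; it assumes that $Z_{10}$ solves the logistic equation $dZ/dt = \theta Z(1-Z)$ with $\theta = \sigma(1-\gamma)$ and rises from $c_1$ to $1 - c_1$, and it takes the length of that window to be $(2/\theta)\log((1-c_1)/c_1)$, which is our own assumption about the middle phase. Under these assumptions $Z(1-Z)\,dt = dZ/\theta$, so the integrated recombination rate $\rho(2N)\int Z_{10}(1-Z_{10})\,dt$ equals $2N\rho(1-2c_1)/(\sigma(1-\gamma))$. All function names and parameter values are hypothetical.

```python
import math


def logistic(t, y0, theta):
    """Solution at time t of dy/dt = theta * y * (1 - y) with y(0) = y0."""
    e = math.exp(theta * t)
    return y0 * e / (1.0 - y0 + y0 * e)


def expected_recombinants(rho, N, c1, sigma, gamma, steps=20000):
    """Trapezoid-rule value of rho * 2N * integral of Z(1 - Z) dt over the
    (assumed) middle phase, during which Z rises from c1 to 1 - c1."""
    theta = sigma * (1.0 - gamma)
    # Time for the logistic curve to go from c1 to 1 - c1 (assumption).
    t_mid = (2.0 / theta) * math.log((1.0 - c1) / c1)
    h = t_mid / steps
    total = 0.0
    for i in range(steps + 1):
        z = logistic(i * h, c1, theta)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * z * (1.0 - z)
    return rho * 2 * N * total * h


def expected_recombinants_closed_form(rho, N, c1, sigma, gamma):
    """Closed form obtained from Z(1 - Z) dt = dZ / theta."""
    return 2 * N * rho * (1.0 - 2.0 * c1) / (sigma * (1.0 - gamma))
```

In this sketch the expected number of type 11 recombinants produced during the middle phase is proportional to $\rho N$, which is consistent with $\rho N$ being the key parameter.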

After $T_{Z_{10};1-c_1}$, $X_{11} + X_{10}$ is likely to remain close to 1 (event $E_5$) and hit
1 at time $T_\infty$ (event $E_6$). We ignore any more recombination events between
type 10 and 01, and $Z_{11}$ is a time-changed branching process during this time.
If $Z_{11}$ has not hit $\delta_{11}$ by time $T_{Z_{10};1-c_1}$ (event $E_4$), then we continue to keep
track of $Z_{11}$ until $T_\infty$, at which time it most likely has already hit either $\delta_{11}$ or 0
(event $E_8$). In the latter case, we regard type 11 as having failed to establish and
since $X_{10}$ is most likely to be 1 (event $E_6$) at $T_\infty$, the earlier mutation has gone
extinct. If $X_{11}$ hits $\delta_{11}$ before $T_\infty$, we regard type 11 as having established and
hence it will, with high probability, eventually sweep to fixation (Lemma 3.2).

Proposition 3.1 below estimates the probabilities of events $E_1$ through $E_8$.
These are 'good' events, on which we can approximate the establishment probability
of type 11 by the probability that $Z_{11}$ hits $\delta_{11}$ by time $T_\infty$. Proposition 3.1
is essential for the proof of Theorem 2.3, and will be proved in §4.

Proposition 3.1. There exists positive constants ^10.3 and (5io,4 > whose 
exact value depends on a, 7 and C, such that Ci,C2,C3 in the definition of 
El, . . . , Es are all < N~^"''^ and for sufficiently large N , 

(6) nE2^Ei)<Cp,^.,N-^^°^' 

(c) P(£;3^ n £2 n Si) < c^,^,.7V-*"''^ 

(d) v{El r\E4r\E3nE2nEi) < Cp,^.aN~^^°^^ 

(e) P (sg n £4 n ^3 n ^2 n £1) < Cp,^,^N-^''>-\ 

Consequently, we have (/) P (T^g n n £^2 H ) < Cp^^^aN~^^°-K Further- 
more, 

{g) ¥{E^ n £2 n £1) < Cp^-y^^{N-^"'--' + N-^'°'^) 

(h) r{El n £7 n £2 n £1) < Cp,^,„N-^'°-\ 



(a) 



\E'i) 



, 1 — 0-7 + 20 

Lemma 3.2. |P(Xn(T/„) ^ l)-V{Tu■s^^ < oo)| < iv'°s^+^. 

Proof. On $\{T_{11;\delta_{11}} < \infty\}$, $X_{11}$ dominates $\tilde X_{11}$, a birth and death process with
initial condition $\tilde X_{11}(T_{11;\delta_{11}}) = \delta_{11} = \lceil \log(2N) \rceil/(2N)$, jump size $1/(2N)$, and
the following jump rates

$\tilde r^+ = N(1 + \sigma\gamma)\tilde X_{11}(1 - \tilde X_{11}), \quad \tilde r^- = N(1 - \sigma\gamma + 2\rho)\tilde X_{11}(1 - \tilde X_{11}).$

Using standard Markov chain techniques, we may conclude

1 1 — cr-y-|-2p 

which implies P({Xn(r/„) ^ 1, Tn^a^ < oo}) < (2iV)'°s^T^. Since {Xn(T/„) 
1, rii;5^j = cx)} is a set with probability 0, we have the desired result. □ 

Proof of Theorem 2.3. Recall from (3.2) that $a_0 = \zeta/(3\gamma)$ and $t_0 = \frac{a_0}{\sigma}\log(2N)$.
We first show that we can safely ignore $E_1^c$. Let

$E_0 = \{X_{11}(t) = 0 \text{ for all } t \le t_0\}.$



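Comparing with (2.1), we see that the jump process $\bar X_{10}$ with initial condition
$\bar X_{10}(0) = 1/(2N)$, jump size $1/(2N)$, and the following jump rates

$\bar r^+ = N(1 + \sigma)\bar X_{10} + 3\rho N, \quad \bar r^- = N(1 - \sigma)\bar X_{10}$

dominates $X_{10}$ for all time. Then

$d\bar X_{10} = dM + (\sigma \bar X_{10} + 1.5\rho)\,dt$

where $M$ is a martingale with maximum jump size $1/(2N)$ and quadratic variation
$\langle M \rangle$ satisfying $d\langle M \rangle = \frac{1}{4N}(2\bar X_{10} + 3\rho)\,dt$. Hence

$E[\bar X_{10}(t)] = \left(\frac{1}{2N} + \frac{3\rho}{2\sigma}\right)e^{\sigma t} - \frac{3\rho}{2\sigma} \le \left(\frac{1}{2N} + \frac{3\rho}{2\sigma}\right)e^{\sigma t}.$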
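The drift, the quadratic variation rate and the mean of the dominating process can be checked with a few lines of code. The following is a minimal sketch, assuming a process that jumps by $\pm 1/(2N)$ at rates $N(1+\sigma)x + 3\rho N$ (up) and $N(1-\sigma)x$ (down) when its current value is $x$; the function names and the numerical values are ours and purely illustrative.

```python
import math


def drift_and_qv_rate(r_plus, r_minus, N):
    """Drift and quadratic-variation rate of a process that jumps up by
    1/(2N) at rate r_plus and down by 1/(2N) at rate r_minus."""
    h = 1.0 / (2 * N)
    return h * (r_plus - r_minus), h * h * (r_plus + r_minus)


def mean_closed_form(t, N, sigma, rho):
    """Solution of m'(t) = sigma * m + 1.5 * rho with m(0) = 1/(2N)."""
    k = 3.0 * rho / (2.0 * sigma)
    return (1.0 / (2 * N) + k) * math.exp(sigma * t) - k


def mean_euler(t, N, sigma, rho, steps=100000):
    """Forward-Euler integration of the same ODE, as an independent check."""
    m = 1.0 / (2 * N)
    h = t / steps
    for _ in range(steps):
        m += h * (sigma * m + 1.5 * rho)
    return m
```

For instance, with $N = 1000$, $\sigma = 0.1$, $\rho = 10^{-3}$ and $x = 0.02$, `drift_and_qv_rate` returns the drift $\sigma x + 1.5\rho$ and the rate $(2x + 3\rho)/(4N)$, and the two mean computations agree up to the discretisation error of the Euler scheme.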



We recall Burkholder's inequality in the following form:

$E\left[\sup_{s \le t}|M(s)|^p\right] \le C_p E\left[\langle M \rangle(t)^{p/2} + \sup_{s \le t}|M(s) - M(s-)|^p\right],$

which may be derived from its discrete time version, Theorem 21.1 of Burkholder (1973).
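We use this and Jensen's inequality to obtain

$E\left[\sup_{s \le t_0}|M(s)|\right] \le \left(E\left[\sup_{s \le t_0}|M(s)|^2\right]\right)^{1/2} \le \frac{C}{N}\left(1 + N\int_0^{t_0} E[\bar X_{10}(s) + 1.5\rho]\,ds\right)^{1/2}$

$\le \frac{C}{N} + \frac{C}{\sqrt{N}}\left(\rho t_0 + (N^{-1} + \rho)e^{\sigma t_0}\right)^{1/2} \le C_{\rho,\sigma}N^{(a_0/2)-1}.$ (3.7)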



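Therefore

$E\left[\sup_{s \le t_0}\bar X_{10}(s)\right] \le \frac{1}{2N} + E\left[\sup_{s \le t_0}|M(s)|\right] + 1.5\rho t_0 + \sigma\int_0^{t_0} E[\bar X_{10}(s)]\,ds \le C_{\rho,\sigma}N^{a_0-1}.$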



Since $\bar X_{10}$ dominates $X_{10}$, Markov's inequality gives

$P\left(\sup_{s \le t_0} X_{10}(s) \ge (2N)^{2a_0-1}\right) \le C_{\rho,\sigma}N^{-a_0}.$

On $\{\sup_{s \le t_0} X_{10}(s) < (2N)^{2a_0-1}\}$, the number of recombination events between
type 10 and 01 during $[0, t_0]$ is at most Poisson$(2\rho(2N)^{2a_0}t_0)$, hence

$P(E_0^c \cap E_1^c) \le P(E_0^c) \le C_{\rho,\sigma}(N^{-a_0} + N^{(2a_0-1)/2})$

for sufficiently large $N$. On $E_0 \cap E_1^c$, type 10 has gone extinct by time $t_0$, before a
single individual of type 11 has been born, hence type 11 will not get established,
let alone fix. Therefore

$P(\{T_{11;\delta_{11}} < \infty\} \cap E_1^c) \le P(E_0^c \cap E_1^c) \le C_{\rho,\sigma}(N^{-a_0} + N^{(2a_0-1)/2}).$ (3.8)

Now we concentrate on $E_1$ where type 10 has most likely established itself at
time $t_0$. The nontrivial event here is $E_8 \cap E_7 \cap E_2 \cap E_1$. Let $E_{81} = \{T_{11;\delta_{11}} \le T_\infty\}$
and $E_{82} = \{T_{11;\delta_{11}} > T_\infty, X_{11}(T_\infty) = Z_{11}(T_\infty) = 0\}$, then $E_8 = E_{81} \cup E_{82}$. The
following events have small probabilities

¥{E^ n El) < Cp^^,aN-^^°'^ 
F{{E^UE^)nE2nEi) < Cp^^^^iN-^"'''' + N-^'"'") 
F{Es2nE7nE^nE2nEi) < Cp.^,.A^-*"'-% (3.9) 

by Prop 3.1(b), Prop 3.1(g-h), and Prop 3.1(f), respectively, where the last estimate
above comes from the fact that $E_{82} \subset E_4$. There are two events with significant
probabilities: on $E_{82} \cap E_7 \cap E_6 \cap E_2 \cap E_1$, we have $X_{11}(T_\infty) = 0$, $X_{10}(T_\infty) = 1$,
hence type 10 fixes by time $T_\infty$, and on $E_{81} \cap E_7 \cap E_2 \cap E_1$, $X_{11} = Z_{11}$ hits
$\delta_{11}$ and gets established by time $T_\infty$. On both these events, $X_{11} = Z_{11}$ until at
least $T_\infty \wedge T_{11;\delta_{11}}$. The union of these two events, $E_{82} \cap E_7 \cap E_6 \cap E_2 \cap E_1$ and
$E_{81} \cap E_7 \cap E_2 \cap E_1$, and the three events in (3.9) is $E_1$. On $E_1 \cap E_2$, for exactly
one of the two events $\{T_{11;\delta_{11}} < \infty\}$ and $\{T_{Z_{11};\delta_{11}} \le T_\infty\}$ to occur (i.e. either
the former occurs but the latter does not, or the latter occurs and the former
does not), one of the following three scenarios must occur:

1. $X_{11}$ and $Z_{11}$ disagree before $T_\infty$, i.e. $E_7^c$;

2. $X_{11}$ and $Z_{11}$ agree up to $T_\infty$, but do not hit $\{0, \delta_{11}\}$ before $T_\infty$, i.e. $E_8^c$;

3. $X_{11}$ and $Z_{11}$ agree up to $T_\infty$ and $X_{11}(T_\infty) = 0$, but $X_{10}(T_\infty) < 1$, thus allowing
the possibility of type 11 being born due to recombination between
type 10 and 01 individuals after $T_\infty$, i.e. $E_6^c$.

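Hence

$|P(\{T_{11;\delta_{11}} < \infty\} \cap E_1) - P(\{T_{Z_{11};\delta_{11}} \le T_\infty\} \cap E_1)|$

$\le P(E_2^c \cap E_1) + P((E_8^c \cup E_7^c) \cap E_2 \cap E_1) + P(E_{82} \cap E_7 \cap E_6^c \cap E_2 \cap E_1)$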

< Cp^^^aiN^^^"-'' + N~^^°'^) 

by (3.9). From (3.8), we have

$|P(T_{11;\delta_{11}} < \infty) - P(\{T_{11;\delta_{11}} < \infty\} \cap E_1)| = P(\{T_{11;\delta_{11}} < \infty\} \cap E_1^c) \le C_{\rho,\sigma}(N^{-a_0} + N^{(2a_0-1)/2}).$

But by Proposition 3.1(a),

nEi) 



2a 



1 + cr 

We combine the three inequalities above to conclude 

2(7 



l + d 



for some $\delta > 0$, and then use Lemma 3.2, as well as (3.4) and (3.6), to obtain
the desired conclusion. □



4. Proof of Proposition 3.1

We divide the evolution of $X_{10}$ and $X_{01}$ roughly into 4 phases, 'stochastic',
'early', 'middle', and 'late', and use Lemmas 5.1, 5.2, and 5.3 for each of the
last 3 phases, respectively. Lemma 4.1 deals with the early, middle, and late
phases of $X_{01}$. Because $X_{01}$ starts at $U = (2N)^{-\zeta} \gg 1/(2N)$ at $t = 0$, it has
no stochastic phase. Its early phase is between $t = 0$ and the time when $X_{01}$
reaches $c_{01,1}$. Its middle phase is between $c_{01,1}$ and $1 - c_{01,2}$, after which it enters
the late phase.

For type 10, since $X_{10}(0) = 1/(2N)$, whether it establishes itself is genuinely
stochastic (i.e. its probability tends to a positive constant strictly less than 1 as
$N \to \infty$). The stochastic phase lasts for time $t_0$, when, with high probability,
either type 10 has established or it has gone extinct. If $X_{10}$ reaches $O((2N)^{a_0-1})$
by time $t_0$, it enters the early phase, which is dealt with by Lemma 4.2. Part (b)
of that lemma says that if $\zeta < \gamma$ (as mentioned before, we only deal with Case 1a of
§1) then it does not reach $c_{10,2}$ until $X_{01}$ has entered its late phase, while part (c)
says that it does reach $c_{10,3}$ at some finite time. The proof of Proposition 3.1(a-b)
reconciles various stopping times used in Lemmas 4.1 and 4.2, and prepares
for part (c) of Proposition 3.1, which deals with the middle phase of $X_{10}$ during
which $X_{10}$ increases from $c_{10,3}$ to $1 - c_{10,3}$, displacing $X_{01}$ in the process. The
$c_{ij,k}$'s we use throughout the rest of this paper are small positive constants, all
of $O((2N)^{-\delta_{ij,k}})$, whose exact values are specified immediately below (4.2).

Recall the definition of the logistic growth curve $L(t; y_0, \theta)$ from (3.1). Throughout
the rest of this section, we use $L(t; (2N)^{-\zeta}, \sigma\gamma)$ to approximate the trajectory
of $X_{01}$ during its early phase and $t_{01;x}$ to denote the time when this
approximation hits $x$, e.g. $t_{01;c_{01,1}}$ below is when it hits $c_{01,1}$. Furthermore, we
use $t_{01,x,y}$ to denote the time this approximation spends between $x$ and $y$. Thus

$L(t_{01;x}; (2N)^{-\zeta}, \sigma\gamma) = x$ and $L(t_{01,x,y}; x, \sigma\gamma) = y.$
We also define

$t'_{01;1-c_{01,2}} = t_{01;c_{01,1}} + t_{01,0.9c_{01,1},1-c_{01,2}}.$

In the above, $t_{01,0.9c_{01,1},1-c_{01,2}}$ is the length of time for which we use the event $A_2$
in Lemma 4.1 below. On the event $A_1^c$ defined in that lemma, $X_{01}$ has reached $0.9c_{01,1}$
at time $t_{01;c_{01,1}}$, after which event $A_2^c$ ensures that $X_{01}$ grows to levels slightly smaller
than $1 - c_{01,2}$ after another time period of length $t_{01,0.9c_{01,1},1-c_{01,2}}$. Roughly
speaking, the time when $L(\cdot; (2N)^{-\zeta}, \sigma\gamma)$ is between $0.9c_{01,1}$ and $c_{01,1}$ is counted
twice. We observe that



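$t_{01;c_{01,1}} = \frac{1}{\sigma\gamma}\log\frac{(2N)^{\zeta} - 1}{\frac{1}{c_{01,1}} - 1},$

$t'_{01;1-c_{01,2}} = t_{01;1-c_{01,2}} + t_{01,0.9c_{01,1},c_{01,1}} = \frac{1}{\sigma\gamma}\log\left[\left((2N)^{\zeta} - 1\right)\left(\frac{1}{c_{01,2}} - 1\right)\right] + \frac{1}{\sigma\gamma}\log\frac{\frac{1}{0.9c_{01,1}} - 1}{\frac{1}{c_{01,1}} - 1}.$ (4.1)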
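The hitting times above can be checked numerically. The sketch below assumes that the logistic curve of (3.1) is the solution of $dy/dt = \theta y(1-y)$ with $y(0) = y_0$ (the display (3.1) is not reproduced in this section, so this form is an assumption), and it verifies both the defining property $L(t_{01;x}; y_0, \theta) = x$ and the 'counted twice' identity for $t'_{01;1-c_{01,2}}$. The function names and parameter values are hypothetical.

```python
import math


def logistic(t, y0, theta):
    """Solution at time t of dy/dt = theta * y * (1 - y) with y(0) = y0."""
    e = math.exp(theta * t)
    return y0 * e / (1.0 - y0 + y0 * e)


def hitting_time(y0, x, theta):
    """Time for the logistic curve started at y0 to reach level x,
    for 0 < y0 <= x < 1."""
    return math.log((1.0 / y0 - 1.0) / (1.0 / x - 1.0)) / theta


def t_prime(y0, c1, c2, theta):
    """t' = t_{01;c1} + t_{01,0.9*c1,1-c2}: the stretch between 0.9*c1
    and c1 is traversed twice."""
    return hitting_time(y0, c1, theta) + hitting_time(0.9 * c1, 1.0 - c2, theta)


def t_prime_alternative(y0, c1, c2, theta):
    """Equivalent form t_{01;1-c2} + t_{01,0.9*c1,c1}."""
    return hitting_time(y0, 1.0 - c2, theta) + hitting_time(0.9 * c1, c1, theta)
```

With $y_0 = (2N)^{-\zeta}$ and $\theta = \sigma\gamma$, `hitting_time(y0, c1, theta)` reproduces the first line of (4.1), and the two expressions for $t'$ agree up to rounding.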



We recall that $a_0 = \zeta/(3\gamma)$ and define the constants required for the rest of the
proof, as well as $c_1$, $c_2$, and $c_3$ as required by Proposition 3.1:

^10,0 

&01.0 



WZ7 
47 4 ' 

ao + fli — 1, Oio,2 — -Z , Oio,3 



761 



90 



c 



Cl 



3 

7^10,2 



1 — ^01.2 — 
< I (&10,2 



7&I 



3 

&01.1 



^01,2) , <^io,; 



76 



10.2 



9 - 3 ' ' ' ' ' ' 60 

2A^cio,o(cio,o + coi,o), has = (ao - ai)/4, 



540 



(4.2) 



and Cij,fc = (2iV) ^^i.^. These choices imply oi + foio, 2 + 2/7 < 1 — C/7j which 
in turn implies the following: 



(1 - ai) log(2iV) + logcio.2 + - logcoi,2 > - log((27V)« - 1), 

7 7 



log((27V) 



l-a. 



log 



1 



0.9c 



10,2 



- 1 ) - - log 

7 



1 



- 1 



C01,2 



>^log((2^)'^-l) + ilogi^. 



log((27V)i-'^ 
1 



> 



7 



log 



((27V)C-1) 



log 



1 



0.9ci 



1 



C01,2 



log 



0.9coi,i 



- 1 



- 1 



'^^01;l-coi,2 (4-3) 



for sufficiently large $N$. This will be needed in Lemma 4.2.

Lemma 4.1. Let $R_{01} = T_{11;1/(2N)} \wedge T_{10;c_{10,2}}$. We define

$A_1 = \{X_{01}(s) < 0.9L(s; (2N)^{-\zeta}, \sigma\gamma) \text{ for some } s \le t_{01;c_{01,1}} \wedge R_{01}\}$

$A_2 = \{X_{01}(s) < L(s - t_{01;c_{01,1}}; 0.9c_{01,1}, \sigma\gamma) - (2N)^{-\delta_{01,1}} \text{ for some } s \in [t_{01;c_{01,1}}, t'_{01;1-c_{01,2}} \wedge R_{01}]\}$

$A_3 = \{X_{10}(s) + X_{01}(s) < 1 - (2N)^{-\delta_{01,1}/2} \text{ for some } s \in [t'_{01;1-c_{01,2}}, S_{10,01,\mathrm{rec}})\}.$

Then

(a) $P(A_1) \le C_{\rho,\gamma,\sigma}N^{-(1-\zeta)/4}$

(b) $P(A_2 \cap A_1^c \cap \{t_{01;c_{01,1}} \le R_{01}\}) \le (2N)^{-\delta_{01,1}}$

(c) $P(A_3 \cap A_2^c \cap A_1^c \cap \{t'_{01;1-c_{01,2}} \le R_{01}\}) \le CN^{-\delta_{01,1}/4}.$

Consequently, 

P((A3 UA,U A,) n < Roi}) < Cp,^,.(2Ar)-^oia. 



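Proof. Early Phase. Before the stopping time $R_{01}$, the jump rates of $X_{01}$
satisfy

$r^+_{01} \ge NX_{01}[(1 + \sigma\gamma + \rho)(1 - X_{01}) - 1.1\sigma c_{10,2}],$

$r^-_{01} \le NX_{01}[(1 - \sigma\gamma + \rho)(1 - X_{01}) + 1.1\sigma c_{10,2}].$

We take $\xi = X_{01}$, $a = 1 + \rho$, $\theta = \sigma\gamma$, $\delta_0 = 1.1\sigma c_{10,2}$, $\delta_1 = c_{01,1}$, $\delta_2 = (1 - \zeta)/4$,
$Y$ such that $Y(t) = (2N)^{-\zeta} + \int_0^t Y(s)(\sigma\gamma(1 - Y(s)) - 1.1\sigma c_{10,2})\,ds$, and
$u_0 = \inf\{t : Y(t) = \delta_1\} \ge t_{01;c_{01,1}}$ in Lemma 5.1 to obtain

$P(X_{01}(s) < 0.99Y(s) \text{ for some } s \le t_{01;c_{01,1}} \wedge R_{01}) \le C_{\rho,\gamma,\sigma}N^{-(1-\zeta)/4}.$

Prior to $u_0$, $Y$ is sandwiched between $L(\cdot; (2N)^{-\zeta}, \sigma\gamma - 1.2\sigma c_{10,2})$ and $L(\cdot; (2N)^{-\zeta}, \sigma\gamma)$.
Since $L(t; (2N)^{-\zeta}, \sigma\gamma) - L(t; (2N)^{-\zeta}, \sigma\gamma - \nu) \le (1 - e^{-\nu t})L(t; (2N)^{-\zeta}, \sigma\gamma)$ for
$\nu \le \sigma\gamma$, we have

$Y(t) \ge L(t; (2N)^{-\zeta}, \sigma\gamma - 1.2\sigma c_{10,2}) \ge e^{-1.2\sigma c_{10,2}t}L(t; (2N)^{-\zeta}, \sigma\gamma) \ge 0.99L(t; (2N)^{-\zeta}, \sigma\gamma)$

for $t = \Theta(\log N)$. Hence (a) follows.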

Middle Phase. Before Rqi, Xu = 0. Using the jump rates of Xqi in (2.1), we 
can write 

^01 (^ A i?oi) = 60 + Moiit A i?oi) 

ptARoi 

+ / Xoi(s)[a7(l - Xoi(s)) - (a + p)Xio(s)] ds, 
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where Afoi(- A -Rqi) is a martingale with maximum jump size 1/{2N) and 
quadratic variation (7\/oi)(t A i?oi) = ^jjf l^"^^"^ -''^oi(s)(l — Xoi(s)) ds. We ap- 
ply Lemma 5.2 with bo = Xoi(toi;coi,i), "i = *oi;coi,i, "2 = *oi;i-coi.2' '^i = ^10,2. 
^2 = 00, eo = 601,1, ei = 601,2, £3(0 = £4(0 = 0, T = i?oi and L'l = Al. Then 
since 610,2 > (^01,1 + ^01,2) A i, we have 

P({|Xoi(s)-i(s-toi;coi.i;Xoi(ioi;coi.J,fT7))l > {2N)-^°''' for some 
s € [toi ;coi,i 7 ''Olil-coi 2 

A i?oi]} n n {toi;c„i,i < Roi}) < (2iV)-*«i'\ 

where ^oi.i is defined in (4.2). Now for paths in Al n {toi;coi 1 -Roijj we have 
-^oi(ioi;coi.i) > 0.9coi,i and hence 

Lis ~ ioi;coi,i ; -'^01 (i01;coi,i ): ^^l) > L{s - toi;coi,i ; 0.9coi,l, (T7). 

The desired conclusion in (b) follows. 

Late Phase. On n n {ioi:i_coi 2 — ^01}, since (5oi,i < 601,2, we have 

^oi(toi;i-coi,2) > i(ioi;i-co,,2 -ioi;coi,i;0.9coi,i,a7) - (27V)^*-.i 

> 1 - coi,2 - (2A^)"''''^'^ 

Therefore Xm{t'Q^.^_^^^ ^) < 2{2N)-^'>'-' . Before 5io,oi,rec, ^11 = 0, and the 
jump rates of Xoo satisfy 

r^o < ^(1 - ^7 + p)Xoo{l - Xoo), r^^ > N{1 + aj + p)Xoa{l - Xoo). 

By Lemma 5.3, P({sup,>e Xoo(t) < iV-^-^^/^} n n n {t[,i^i_,„^ ^ < 

^01}) < CN^^^^, which implies the desired conclusion in (c). □ 

For the remainder of this section, we define the following events 

Ail = {Xiois) > (27V)°''+'^i-i for some s < io A Toi;,„,,„ A Tii.i/(2W)} 

A42 = {Xio(to)e [i,(27vr-"i-i]} 

Ai = A41 U A42 U 

B4 = {to < Toi:coi,o A Tii.i/(2Ar)} 

A51 = {Xio(s) > cio,2 for some s e [to, ioia-coi,2 ^ Tii;1/{2N)]} 

^52 = {T"l0;cio.3 ^ ^"11:1/(2^) > ^0 + tearly}- 

Lemma 4.2. Recall that to = ^log(2A^), (5io,o = 2iVcio,o(cio,o + coi,o), ^10,1 = 
(ao — ai)/4. We have 

(a) P(A4i) < 2<5io,oio + Cp,-,^^N'''' 

P(A42 n Al, n B4) < Cp^^,aN-^' + 2Sio,oto 

l-a + p 



¥{E1 n S4) 



l + a + p 

(6) P (A51 n n B4) < Cp,^,,Ar-*i'''i 
(c) P(A52 n n B4) < Cp,7,^A^"*'"'' • 



< 6<5io,oto + Cp,^,.iv-'^^ + P(s^) 
Proof. Stochastic Phase. We define $R_{01,11} = T_{01;c_{01,0}} \wedge T_{11;1/(2N)}$. Before
$T_{11;1/(2N)}$, the jump rates of $X_{10}$ are as follows:



' 10 



NXwiil + T + - Xio) - {<yi + p)^oi], 



rfo = NXw[{l-^ + p)(l-Xw) + {(ri + p)X, 



oij 



We define 77 to be a jump process with 77(0) — 1/{2N), jump size 1/{2N) and 
jump rates as follows: 



i),10 



Nt]{1 + a + p), r,, 10 = ^^(1 - ^ + p)- 



then prior to Sxia,v,diff A Tio;cio,o A i?oiai, we have |r+ - r+^g] < Siq^q and 
jrj^Q — ?',7iol 1^ ^10,0- Therefore |Xio — r^j is a jump process with initial value 
0, jump size 1/(27V) and jump rates at most 2(5io.o, and we can estimate the 
probability of \Xiq — 77 1 becoming nonzero before to'- 



P {S X 10, ri, di f f < A TlOicio.o A -Roi,ll) < 2^10, 0*0- 

Since ry is a branching process. Lemma 6.1(a) implies 
P (sup 7/(s) > (2A^)'^"+''i-M < Cp.^.^N-"^ 

\s<to J 



(4.4) 



1 <??(to) < (2^)""""^"') < Cp.^.^N-'^' 



\v{to) = 0) 



1-a + p 



l + a + p 



< 



l_a+_p_ 

l + a + p 



Using (4.4), we can replace 77 in the above three estimates by Xiq if we allow 
an additional error term. In particular. 



sup Xio(.s) > cio,o = {2NY°+''^-^ 

,s<to A-Roi,ii / 



sup -'^^io(s) > Cio,07 Sxio,ihdiff <to A T'iO;ciQ,o A i?01,ll 

, S<toA/?ol,ll 



+P sup Xio{s) > Clofi, Sxio.V.diff > ^0 A riO;cio.o A i?oiai 

\s<toA_Roi,ii 



< 2SwMo + IP sup 77(5) > cio,o < 2(5io,oio + Cp^^^^N-"^ . 

ys<toA_Roi,ii J 

Similarly, we can obtain the second statement of (a) and 

l-a + p 



\{Xioito)^o}nAl^nBi) 



l + a + p 



<2Voto+P(^4i)+P(S|), 



which implies the third statement in (a). 
Early Phase (Upper Bound). Before $T_{11;1/(2N)}$, the jump rates of $X_{10}$ satisfy

$r^+_{10} \le N(1 + \sigma + \rho)X_{10}(1 - X_{10}), \quad r^-_{10} \ge N(1 - \sigma + \rho)X_{10}(1 - X_{10}).$

We take I = Xw, a = = <t, (5o = 0, 5i = 0.9cio,2, ^2 = ho.i = (ao-ai)/4, 
y(t) = i(t;Xio(to),2cr), and mq = Rw = inf{t > : Xio(to), 2cr) > (5i} in 
Lemma 5.1 to obtain 

P({Xio(to + s) > 1.01L(.s;Xio(to),'T) for some s > (i?io A rii.i/(2Ar)) - to]} 

On A%r]BiC {Xio(to) e ((2Ar)ao-ai-i^ {2NY'>+''^~^)}, we have 

to + Rio > — 
a 

> f' 

by (4.3) and the definition of toi.i_coi 2 Hence if 

Xioito) e ((2Ar)"«-"i-\(2Ar)''"+"i-i), 

thenL(t[)i.i_^^^^-to;-'^io(to),CT) < L (i?io ; Xio (to), cr) = 0.9cio, 2, which implies 
(b). 

Early Phase (Lower Bound). Before Tii.i/(^2N)j the jump rates of Xi^ satisfy 

rto > N{1 + a(l - 7))^io(l - ^10), r" < N{1 - a{l - 7) + 2p)Xio(l - X^o). 

We take ^ to be Xiq shifted forward in time by to, ct = 1+p, 6 = (t(1— 7)— p, Sq = 
0, Si ^ L01cio,3, S2 = ^104 = (ao - ai)/4, ^(t) = L(t; iV''«-''i-i, a(l - 7) - p), 
and Uq = inf{t : Y{t) ~ l.Olcio.3} in Lemma 5.1 to obtain 

P({Xio(to + s)< 1.005L(s; N''»-^'~\a{l - 7) - p) 

for some s < A (Tn;i/(2Ar) - to)} ClAln B4) < Cp^^^^N-^'"'^ 

Since uq < teariy — g-(i^°")_p log(2-/V), the conclusion in (c) follows. □ 
Proof of Proposition 3.1(a-b). We define t2 = to + t^ariy and 

-£■21 = {T'iOxio.s !i 'S'lO,01,rec A t2} , 

i^22 - {^oi(Tio;cio,3) > 1 - cio,3 - (27V)-^-.i/2|^ 

Fl = |'5'lO,01,rec < TiQ-cio,3 A (t2 V toi- i_coi,2 ) } ' 

then £^21 n i?22 C i?2- Before rio;cio,3 A (t2 V t'Q^.^_^^^_^ ^), the rate of recombi- 
nation events between type 10 and 01 individuals is at most ApN XiqXqi < 
ApNN~^^°-^ < CpN~^^°'^ . Hence the total number of recombination events be- 
tween type 10 and 01 individuals before Tioido.s A (^2 V t'Qi.i_^^_^_^ ^) is dominated 
by a Poisson random variable with mean C p^^,cN^^"''^ \ogN . Therefore 

P(-Fi) < Cp,-,,<,iV-^''i°-^/^ (4.5) 
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On Ff , we have 510,01, rec > T'io;cio 3 or 5io.oi,rec > ^21 We observe that 

;cio,3 »S'io,01,rec V t2} n {^10 ,01, rec Via} 

,01, rec 

> t2, riO;eio,3 > t2}- (4.6) 

Therefore Lemma 4.2(c) imphes 

V{EI^ n nAin B^) < Cp,^^^N-^">-' . (4.7) 

Let F4 = {Tioxio 2 — ^011-coi -3 ^ T'ii;i/(2Af)}: then reasoning similar to that 
of (4.6) imphes 

-F4 n {ioi;i„coi,2 - ^ll;l/(2Ar) V Tio^io.a} ^ {riO;cio.2 V ^oia-coi.a ^ T'll;l/(2Ar)}, 

which imphes 

P({toi;i-c„i,2 > Tna/(2N) V Tio;cio,2} n F21 n n n B4) 

< IP({riO;cio,2 V ioi;l_eo,2 > 7^ii;i/(2W)} ^ i?2i H H H B,) 

+P{F4r\Alr\B4). 

The first set on the right hand side satisfies 

{^10;cio,2 V ioi;i_coi.2 ^ ^11;1/(2W)} H F21 H F^ 

C {Fi0;cio,2 V toi;l-coi,2 - ^11;1/(2W) > T'iO;cio,3 V ^2} H E21 
C {Fi0;cio,2 V ioi;l-coi,2 - ^11;1/(2W) > t2> TlO:cio,3} = 0J 

therefore 

P({t^,i;i_coi,2 > Tii.,i/i2N) V rio;eio,, n F21 n Ff n n ^4) 

< P(F4 n n B4) < Cp,-y,^N^^">-' (4.8) 

by Lemma 4.2(b). On {t'oi.i_coi,2 < ^ii;i/(2Ar) V Tio;cio,2}, we have rio;cio,3 > 
TiO;cio 2 ^ ^oi:i-coi 9' therefore Lemma 4.1 imphes 

P({^10(ri0;Cio,3) +^0l(Tl0;Cio,3) < 1 ^ (2iV) "^^ ' ^ } (49) 

n{t[,i.i_,„^^ < Tn;i/(2W) V Tio;eio,2} n F21 n f^ nAln B4) < Cp,^^N-'»'-K 

Combining (4.5), (4.7), (4.8), and (4.9) yields 

P((F^2 u F^i) n n n S4 n Fi) < Cp^^^iN-^"^-^ + n-^^"-^ + iv-^f-ica/S)^ 

where we also recall from Lemma 4.2 that A4 = A%^ n A42 fl Fi. We further 
combine the above estimate with the first two statements of Lemma 4.2(a) to 
obtain 

P((^^22UF2^)nS4nFi) 

< Cp^^aiN-^"^-^ + N-^^°-^ + AT-^fcio.a/s ^^^^^^^ _^ N-"^). (4.10) 



Yu, Etheridge & Cuthbertson/4 PROOF OF PROPOSITION 3.1 






It remains to show that B% = {to > Tqiicoi.o ^ ^ii;i/(2W)} has a smah prob- 
abiUty. Let F2 = {To^cm.o < A rii;i/(2W)}- Before Tn-i/faAr), the jump rates 
of Xqi satisfy 

4i < N{1 + (77 + p)Xoi(l - Xoi), > N{1 - (77 + p)Xoi(l - Xqi). 

We take ^ = Xqi, a = 1 + p, 6* = 0-7, (5o = 0, (5i = 0.9coi,o, ^2 = (1 - C)/4, and 
= L{t\ i2Ny^, (77) in Lemma 5.1 to obtain 

P (^01 (s) > coi.o for some s < toi;0.9coi,o A Tii-i/i2N)) < Cp-,^,N-^^-^^/\ 

By the choice of ao in (4.2), to = log(27V) - ^ log(27V)«/3 < ioi;0.9(2iV)-c/3 = 
ioi;0.9coi,o; therefore 

Wc observe that i3| n C {to A Toi;coi 1 > 2^ii;i/(2Af)}- By an argument similar 
to the one leading to (4.5), P(B| n F^) < Cp.^^^N''-/'^, which implies 

HBD < Cp,^,.(iV-(i-';)/4 + N-</^). (4.11) 

Combining (4.10) and (4.11) yields the desired result in (b). For part (a), we 
combine the third statement of Lemma 4.2(a) and (4.11) to obtain the desired 
result. □ 

Proof of Proposition 3.1(c-e). Recall that Zw{TiQ-cio.i +^) = -^(^5 ^10,3, (7(1—7)) 
for t e [Tio;cio,3> 7zio;i-cio,3]: and Tzio;i-cio,3 = Tw,c^o.3 + ■^{T=r^ log . We 

work on t > Tio;cio 3 throughout this proof. On E2nEi, we have Xoi(Tio;cio 3) ^ 
l-cio,3- (2Ar)'-*'"'i/^ Xio(rio;cio.3) = cio,3 and Xoo(Tio;cio,3) < {2N)~^'>^^^/\ 
We can then write down the following equation using the jump rates of Xiq 
in (2.1): 



Xioit) = cio,3 + Mio(t)+ / Xio(s)[(7(l-7)(l-^io(s)) 

-ctXii(s) + (77X00(5)] +p(Xii(s)Xoo(s) - Xiq{s)Xoi{s)) ds, 

where Miq is a martingale with maximum jump size 1 / (2N) and quadratic vari- 
ation (Mio)(t) = ^ /* {1 + p)Xio{s){l - Xio{s)) + pXn{s)Xoo{s) ds. We 

use Lemma 5.2 with 6 ^ (7(1 — 7), ui = 0, 112 = , log ^ Si ~ ^oi.i/4, 

62 = 00, eo = £1 = ^10.3 = (501. 1/10, T = Tii;5j^ AToo.(2Ar)-5i , £2(0 = -(7Xii(t) + 

ajXooit), e^it) = p(Xn(t)Xoo(t)-Xoi(t)Xio(t)), £4(0 = Xn(t)Xoo(t), r(t) = 
■^loC^^iOxio 3 + and Z^i = E2 H Ei to obtain 

P(|Xio(s,cj) - Zio(s,cj)| > (2iV)~*i°-^ forsomctjG£;2ni;i, (4.12) 
s € [rio;cio,3,?Zio;i-cio,3 ^Tii-Sii A Tpp.(2Nj-5oi,i/i]) < (2A^)~*'i"'% 
where $\delta_{10,2} = (\delta_1 - e_0 - e_1)/3 = \delta_{01,1}/60$, as defined in (4.2). The jump rates of
$X_{00}$ satisfy

r^o < N[{l-<jj + p)Xoo{l-Xoo) + 2pXoiXio],roo > N{l + <jj + p)XoQil-Xoo). 
On E'z n Si, we have Xoo{Tio.^^„ ,^) < {2N)-^°^'^/'^ . Therefore by Lemma 5.3, 

P ( i sup Xoo(s) > (2iV)-''«i-i/4 \f^E2f^El \ < CN-^I"^. 

We combine the above and (4.12) to arrive at the desired conclusion of (c). 
For (d), we observe that the jump rates of Xn+io = Xn + Xiq satisfy 

rti+w = A^Xn[(l + tT + /7)Xoi + (l + a(l + 7) + 2p)Xoo] 
+NXw[{l + (7(1 - 7) + 2p)Xoi + (1 + a + p)Xoo] 

r-n+io = NXii[{l-a + p)Xai + {l-a{l + -i) + p)Xoo] 

+7VXio[(l - ct(1 - 7) + p)Xm + + p)Xoo], 

where we drop the terms involving XhXiq in rj^ and r^i^ which correspond 
to type 11 individuals replaced by type 10 individuals or vice versa. Therefore 
Xii+io dominates 1 — rj where we define 77 to be a jump process with initial 
condition ?7(rio;cio.3) = 1 - -'^ii+io(rio;cio,3) and jump rates of 

r+ = N{1 - a{l - 7) + p)t](1 - tj), = iV(l + a{l - 7) + p^l - tj). 

Since r]{Tzio\i-cio,3) < 1 - -'^io(Tzio;i-cio,3) < cio,3 on £^4 n £^3 n £^2 H £1, by 
Lemma 5.3, 

P {Mt) > for some t > Tz,„■,l-c^o.3} n £4 n £3 n £2 H £1) < CN^^^^. 

This implies the desired conclusion of (d). 

Let ?y be a time change of 77 by 1 — 77, then 2iV77 is a branching process and 
the clock for fj runs at the rate of at most 1.02 times that of r] on {fj^t) < 
y/o[^ for all t > Tz,„;i-cio.3} n £4 n £3 n £2 n £1. By Lemma 6.1(b), 

P({ry (rzio;i-cio,3 + OMtiate) > 0} n £4 n £3 n £2 n £i) < CiVcio,3e" 

Hence P {{rj {Tz,„;i-c,o.3 + Uate) > O} n £4 n £3 n £2 n £1) < Ccio,3, which 
implies (e) since Tzi„-i-cio.3 + ^late = Too- □ 



Proof of Proposition 3.1(g-h). We define C4 and (5io,4 such that C4 — max(y^ + 
^11, C2, C3) < A^-2'5io.'4 and we let 

Sx.zjar = inf{< > rio;ci : \Xw{t) - Zio{t)\ V Xoo(t) > C4}. 
By Proposition 3.1(c,d), there exists ^10.3 > such that 

mSx.ZJar < TllAjn£2n£i) 

< P((£3^ u [El n £4 n £3)) n £2 n £1) < Cp,^,<,iv-^l«■^ (4.13) 




where we have used that on E/^ n i?3, Sx,z.far > 7zio:i-ci and on E^, XiQ{t) > 
1 - V^- Xn(t) > 1 - V^T- <^ii and Xoo{t) < 1 - Xw{t) - Xii{t) < ^ for 
t > Tz,a;i-cr - Notice that on E2nEi, Xii(t) = = Zii{t) for all t < T^-ct- 
For t < Sx^i^ZiiMff A Sx,z.far A Tii-Sii, we have 

kz4i - ■''nl < NSii[{a - p)3ci + dn] + 2pN{3ci + Sn) < ANSnc^ 

and similarly, \r'^ n — r^i\ < 4iV(5iiC4. Thus the absolute difference between 
Xii and Zii is bounded above by a Poisson process of rate 8NS11C4, which 

1 /2 

stays during [Tm-ci , T'iO;ci + imid + tiate] with probability at least 1 - C4' , if 
irmd + Uate < C4 which is Satisfied by our choice of Unid + Uate = ©(log A^). 
Hence 

n{Sx,^,z,^Mlf < Too A Sx,ZJar A J H H ^1) < cj\ 

We combine (4.13) and the above estimate to obtain 

n{Sx,,,z,ud^^^ <Too/\ T^iMi} n ^2 n e^) < c^i^ + Cp,^,,7V-'^l«■^ 

which implies (g). 

Let ^3 = {T2^i;{o,5ii} > Tzio;i-ci}- Starting from Tz^^-i-c^, ^11 is a time- 
changed branching process. We perform a time change of 1 — Zn (from time 
T!zio;i-ci onwards) to obtain a branching process Zn, then the clock for Z\\ 
runs faster than that of Z\\ (at a rate of at most 1/(1 — times before Z\\ 
reaches From time Tzioii-ci onwards, and b\\ are absorption points for 
Zx\{- A We use Lemma 6.1(d) below to deduce that 

P({Zii(Too aTziiaJ e (0,(5ii)}nF3n^2nSi) 

< P ({Zn(s) G (0,(Sii) for aU s < (1 - bx^)T^} nF^nE^^n Si) 

< (2Ar,5ii)2Cp,^,„exp(-0.99a7(roo - Tzi„;i-cJ), 

< Cp,^^<,(log2 TV) exp(-0.99(77i,ate) < Cp,^,,iV-^l«•^ 

if we choose a sufficiently small (5io,4. Therefore 

P({Zii(roc,) e (o,(5ii), Tzii;5ii >T^.}nF3nE2nEi)<Cp,^,^N-^'«-\ 

On > Too Arii;iii}, Xii and Zn agree up to Too ATii;^^^. There- 

fore 

^ i{Sxii,ZiiMff > TocTii-Sii > Too, XiiiTao) = Zii{Too) G (0,(5ii), 

We can drop the condition rii;{o,5ii} > T'zioa-ci, since on {^Xn.Zn.di// > 
T'oo,T'ii;5ii > T'oo,rii;{o,<5ii} < T'zioa-ci}, we have Xii(Too) = ZuiTao) = 0. 
Hence 

P ({Tii;5„ > Too, Xii(Too) = Zii(roo) e (0, (5ii)} n S7 n S2 n Si) < Cp,^,,A^"*i' 

which implies the desired result in (h). □ 
5. Supporting Lemmas

In this section, we establish Lemmas 5.1 to 5.3, one each for the early, middle,
and late phase. They are used for the proof of Proposition 3.1 in §4. Lemma 5.1
deals with the early phase and approximates a 1-dimensional jump process undergoing
selection by a deterministic function, where the error bound depends
only on the initial condition of the process, as long as the process is stopped
before it reaches $O(1)$. Lemma 5.2 deals with the middle phase and uses the
logistic growth as an approximation. The main difference between the early
phase and the middle phase is the error bound: in Lemma 5.2, the error bound
depends on both the initial and terminal conditions of the process. Lemma 5.3
deals with the late phase, for which we only need to show that the process does
not stray too far away from 1 (or 0 for $X_{00}$) once it gets close to 1 (or 0).

Lemma 5.1. Let $a \ge 1$, $\theta \in (0,1)$, $\delta_0 \in [0,1/2]$ and $x \in (0,1]$ be constants.
Let $\xi$ be a jump process with initial value $\xi(0) = (2N)^{-x} \ge (2N)^{-1}$, jump size
$1/(2N)$, and jump rates

$r^+ = N\xi[(a + \theta)(1 - \xi) - \delta_0], \quad r^- = N\xi[(a - \theta)(1 - \xi) + \delta_0].$

Suppose $Y$ is a deterministic process that satisfies

$Y(t) = (2N)^{-x} + \int_0^t Y(s)(\theta(1 - Y(s)) - \delta_0)\,ds.$

If $u_0 = \inf\{t : Y(t) = \delta_1\} \le (\log 2)/(3\theta\delta_1 + \delta_0)$, then there exists $\delta_2 \in (0, (1 - x)/4]$
such that

$P(|\xi(s) - Y(s)| > 4N^{-\delta_2}Y(s) \text{ for some } s \le u_0) \le C_{a,\theta}N^{-\delta_2}.$

Moreover, if ^ and ^ are jump processes such that before a stopping 

time T, then P {^{s) < (1 - 'iN^^'')Y{s) for some s<uqAT^ < CafiN'^^ 
and P {i{s) > (1 + 47V"*2)y(s) for some s<uqAT) < Ca,eN-^\ 
Proof. Wc can write 

rfe = dM^ + -0- So) dt, d{M^) = ^e(i - o dt, 

and consequently, 

die-''m) = dAkit)~e-'\eat)^+Som) dt (5.1) 

d{A%){t) = ^e-''>'am-m)dt. 

We define r = mf{t < uq : £,{t) > 2Si}, and take expectation on both sides 
of (5.1) to obtain 



^[g-fl(*Ar)^(^ A r)] = (2Ar)-^ - E 



e-'^eas)' + Soa^)) ds 



< i2N)- 
As in the steps leading to (3.7), we use Jensen's and Burkholder's inequalities
to obtain



E 



sup \M^{s)\ 

S<t/\T 



e-2e«e(s)l{.<r} ds 



1/2 



- § + ]^ (/ ^"'^(27V)^' ^ a,«A^-(i+-)/^ (5.2) 

Since de-^*r(t) = -e-^^OY {tf + 5qY {t)) dt, we use (5.2) in (5.1) to obtain 



E sup e-^'l^is) -Y{s)\ 

_S<tAT 



e-'-'{0m'-Y{sr\+6om-Y{s)\ ds 



{30Si + (5o)e-^^|C(s) - r(s)|l{,<,} ds 



< Ca,eAf-(i+")/2 ^ / {mi+5o)E 
Jo 



sup e''"'\^{s)-Y{s) 

s<s' At 



ds'. 



Gronwall's inequality implies 



E 



sup e-<'^m-Yis)\ 

S<tAT 



since t < uq < (log2)/(36l(5i + ^o)- Let S2 G (0, (1 - x)/A], then 

P ~Y{s)\ > N-^^-^'e^" for some s < uo A t) < Ca^eA^^'^^ 

We observe that for s < uq, {2N)-^tJ'^-''^^-^«> < Y{s), hence N'^'e''' /Y{s) < 
2Xgie5i+So)s ^ 2^e^^^^~^^°^^^°^'^^^^^^^^'^^°^ < 4, i.e. N'^e^'' < AY{s). Hence 



P (|^(s) - Y{s)\ > 4N~^-'Y{s) for some s < uq A r) < Ca,eN-^\ 

We can drop $\tau$ in the event above, since $|\xi(\tau) - Y(\tau)| \ge Y(\tau)$. The conclusion
follows. □
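To visualise the kind of approximation Lemma 5.1 provides, one can simulate a jump process with the rates of the lemma and compare it with an Euler approximation of the deterministic curve $Y$. The following is only an illustrative sketch, not part of the proof: the Gillespie-style simulation, the rounding of the initial value $(2N)^{-x}$ to the lattice of multiples of $1/(2N)$, and all function names and parameter values are our own assumptions.

```python
import random


def simulate_jump_process(N, x, a, theta, delta0, t_end, seed=0):
    """Gillespie simulation of a process with jump size 1/(2N) and rates
    r+ = N*xi*((a+theta)*(1-xi) - delta0), r- = N*xi*((a-theta)*(1-xi) + delta0).
    Returns a list of (time, value) pairs."""
    rng = random.Random(seed)
    # Initial value (2N)^(-x), rounded to a whole number of individuals.
    n = max(1, round((2 * N) ** (1.0 - x)))
    t = 0.0
    path = [(t, n / (2 * N))]
    while 0 < n < 2 * N:
        xi = n / (2 * N)
        up = max(0.0, N * xi * ((a + theta) * (1 - xi) - delta0))
        down = N * xi * ((a - theta) * (1 - xi) + delta0)
        total = up + down
        if total <= 0.0:
            break
        t += rng.expovariate(total)  # waiting time until the next jump
        if t > t_end:
            break
        n += 1 if rng.random() * total < up else -1
        path.append((t, n / (2 * N)))
    return path


def deterministic_Y(N, x, theta, delta0, t_end, steps=10000):
    """Forward-Euler value at t_end of Y' = Y*(theta*(1-Y) - delta0),
    started from Y(0) = (2N)^(-x)."""
    y = (2 * N) ** (-x)
    h = t_end / steps
    for _ in range(steps):
        y += h * y * (theta * (1.0 - y) - delta0)
    return y
```

Plotting the simulated path against `deterministic_Y` for increasing $N$ gives an informal picture of the relative error bound $4N^{-\delta_2}Y(s)$; the sketch makes no claim about the constants in the lemma.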

Lemma 5.2. Let $\theta, e_0, e_1 \in (0,1)$ and $a_0, a_1 > 0$ be constants. Suppose $Y$ is a
deterministic process defined from a stopping time $u_1$ onwards that has initial
condition $Y(u_1) = b_0 \ge a_0(2N)^{-e_0}$ and satisfies

$Y(t) = b_0 + \int_{u_1}^t \theta Y(s)(1 - Y(s))\,ds.$

Let $u_2 = u_1 + \frac{1}{\theta}\log\frac{(1-b_0)(1-b_1)}{b_0b_1}$ such that $Y(u_2) = 1 - b_1 \le 1 - a_1(2N)^{-e_1}$. Suppose
$T$ is a stopping time and $\xi$ is a jump process that takes values in $[0,1]$, has jump
size $1/(2N)$ and satisfies



tAT 



at AT) = aui) + MitAT)+ ^(s)[0(l-e(s)) + e2(.s)]+e3(s) 



{M){tAT) = 



l + P 
2N 



t/\T 



e(s)(l-^(s))+e4(s) ds, 



where \e2{t)\, |e3(i)| < {2N)^^'- , €4(1) < 1 fort < T, and M is a jump martingale 
with jump size 1/{2N). Furthermore, suppose on a set Di e J-{ui), we have 
— &o| 5: (2A^)^''^. We define D2 — {T > ui} and to be a constant 
< {{Si A ^2 A i) - eo - ei)/3. If S3 > 0, then 



P[{ sup \as,uj)-Y{s,Lu)\> {2N)-^^\nDinD2\ <i2N)-^^ 



Proof Let D = DinD2. Notice that D G J^(tii). Since 

mm - m) + ^2it)] - erm - vm^.^T} 

< i2N)-'^ + em - Y{t)\\i - m - mii{t<T} 

< i2N)-'^+9m-Yit)m,<T}, 

we have 



mui + 1) A T) - r(K + i) A t)\1d < mi) - 

+ |A/((wi + t) A T)1d\ + / [(27V)-*i + em) - i^(5)|]lD ds. 



By Jensen's and Burkholder's inequahties 
E 



sup |Af(sAT)lD| 

ui <s<ui4-£ 



therefore 



sup mAT)~Y{sAT)\lD 

_Ul ^S<U]_-\'t 



<C^-+E[mi)-Yiu,)\lD] 



+2{2N)-'h + / eE[m) ' >'(s)|1{.<t}1d] ds. 

Since E{m) ~ >'(s)|1{.<t}1d] < E{m AT)- Y{s A T)\\dI and m^) ^ 
Y[ux)\\d < (2iV)-''^ we have 



sup mAT)-Y{sAT)\\D 



< c 



1 



iV (27V)*2 (27V)*i 
by Gronwall's inequality. We observe that $u_2 - u_1 \le \frac{1}{\theta}\log\frac{1}{b_0b_1} \le \frac{1}{\theta}[(e_0 + e_1)\log(2N) - \log(a_0a_1)]$,
therefore the estimate above implies



E 



sup AT)-y(s AT)|lz) 

Ui<.S<.U2 



< 



aoGi 



Since < S3 < {{61 A (52 A ^) — eo — ei)/3, we have 



E 



sup |C(sAT)-r(.sAT)|lz5 

Ul<S<li2 



< i2N) 



-2S3 



which impUes the desired conclusion. 



□ 



Lemma 5.3. Let a > 1, 6' G (0, 1), x £ (0, 1], c > and k > be constants. 
Let rj < rj be jump processes where rj has initial value t]{0) = 1 — c{2N)^^ , jump 
size \/{2N), jump rates 

r+ = N{a + 9)7]{1 - 77), r" = N{a - 9)7]{1 - 77) + Nk, 

and absorbing boundary at 1/2. Fort < c{2N)^^ / k (if k, ^ 0, then t = 00), we 
have 



P [ inf fj{s) > 1 - (2iV)--^/2 ] >P( inf t]{s) > 1 - {2Ny/^ ) > 1 - CN-^/^ 

V s<i / V s<t 



Proof. Wc take C ^ 1 ^ ^ ^^^d perform a time change of 1 — ^ on ^ to obtain a 
process ^ with jump rates 

r+ = N{a - 9)i + Nk/{1 - i), f- = N{a + 0)^. 

Let be a jump process with initial condition ^up(O) = ^(0) = c(2N)^^ , jump 
size 1/(2A^) and jump rates 

r+p = N{a - e)Cup + 2Nk, f'^ = N{a + e)i„p. 



Before the stopping time r = inf{t > : > 1/2}, £^up dominates ^. We can 
write 

diup{t) - dM^^^ + {k- e^up) dt, d{AI^^J = _L(k + aS,up{t)) dt. 



Hence -E[|„p(i)] = f + (c(2A^)--^ - f ) e-"* and by Jensen's and Burkholder's 
inequalities, 



E 



,s<2t 



1/2 



< 
if $\kappa t \le c(2N)^{-x}$, in which case

P ( sup ^up{s)> i2N)-''^ A < P ( Slip Ah{s) > {2N)-''^^ ~ c{2Ny ~ 4Kt 

\s<2t ) \s<2t 

r ^AA-(^+i)/2 
- (2Ar)-^/2 - c{2N)-^ - ^nt " "'^ 

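On the set $\{\sup_{s \le 2t} \xi_{up}(s) < (2N)^{-x/2}\}$, $\xi_{up}$ certainly does not reach 1/2 before time $2t$. Hence $\xi_{up}$ dominates $\tilde{\xi}$ before $2t$ for $\omega \in \{\sup_{s \le 2t} \xi_{up}(s) < (2N)^{-x/2}\}$, which implies $P(\sup_{s \le 2t} \tilde{\xi}(s) < (2N)^{-x/2}) \ge 1 - C_{a,\theta}N^{-1/2}$. Because $\tilde{\xi}$ is the process $\xi$ after a time change of $1 - \xi$, the clock for $\tilde{\xi}$ runs faster than that of $\xi$, but at most twice as fast before $\xi$ reaches 1/2. Therefore the estimate above implies

$$P\Big(\sup_{s \le t} \xi(s) < (2N)^{-x/2}\Big) \ge P\Big(\sup_{s \le 2t} \tilde{\xi}(s) < (2N)^{-x/2}\Big) \ge 1 - C_{a,\theta}N^{-1/2}.$$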

The conclusion follows. □ 
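
The drift and quadratic-variation rate of the dominating process in the proof above follow directly from its jump rates and jump size, and the mean then solves a linear ODE. The following is a minimal numerical sketch of that bookkeeping, not part of the original argument. The function names and the parameter values used to exercise them are illustrative assumptions; `k` denotes the constant appearing in the extra jump-rate term and `theta` the constant written $\theta$ above.

```python
import math

def drift_and_qv_rate(xi, N, a, theta, k):
    """Infinitesimal drift and quadratic-variation rate of the dominating
    jump process at state xi (jump size 1/(2N))."""
    jump = 1.0 / (2 * N)
    r_plus = N * (a - theta) * xi + 2 * N * k   # upward jump rate
    r_minus = N * (a + theta) * xi              # downward jump rate
    drift = jump * (r_plus - r_minus)           # simplifies to k - theta*xi
    qv_rate = jump ** 2 * (r_plus + r_minus)    # simplifies to (k + a*xi)/(2N)
    return drift, qv_rate

def mean_closed_form(t, m0, theta, k):
    """Solution of m'(t) = k - theta*m(t) with m(0) = m0."""
    return k / theta + (m0 - k / theta) * math.exp(-theta * t)

def mean_euler(t, m0, theta, k, steps=20000):
    """Explicit Euler integration of the same ODE, as a cross-check."""
    h = t / steps
    m = m0
    for _ in range(steps):
        m += h * (k - theta * m)
    return m
```

Comparing `mean_euler` with `mean_closed_form` for a few parameter choices is a quick way to confirm the exponential relaxation of the mean towards $k/\theta$.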

6. Appendix: A Result on Branching Processes 

Lemma 6.1. Let $\xi^{(k)}$ be a branching process with $\xi^{(k)}(0) = k$ and $u(s) = as^2 + b$ be the probability generating function of the offspring distribution. Then

$$G(s,t) := E\big[s^{\xi^{(k)}(t)}\big] = \left(\frac{b(s-1) - e^{-(a-b)t}(as - b)}{a(s-1) - e^{-(a-b)t}(as - b)}\right)^k.$$

(a) If $k = 1$ and $a > b$, then

1. $|P(\xi^{(1)}(t) = 0) - b/a| \le be^{-(a-b)t}/a$.

2. $P(1 \le \xi^{(1)}(t) \le K) \le C_{a,b}Ke^{-(a-b)t}$ if $K \le e^{(a-b)t}/6$.

3. $P(\sup_{s \le t} \xi^{(1)}(s) \ge K) \le C_{a,b}e^{(a-b)t}/K$.

(b) If $a < b$, then $P(\xi^{(k)}(t) > 0) \le 1.2ke^{-(b-a)t}$.

(c) If $a > b$ and $k \in [1, K]$, then $P(\xi^{(k)}(t) \in [1, K]) \le kC_{a,b}Ke^{-(a-b)t}$.

(d) If $a > b$ and $\xi$ is a branching process with an initial condition that has support on $[0, k]$, then $P(\xi(t) \in [1, K]) \le kC_{a,b}Ke^{-(a-b)t}$. Consequently,

$$P(\xi(s) \in [1, K] \text{ for all } s \le t) \le kC_{a,b}Ke^{-(a-b)t}.$$
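
As an illustration of the generating function in Lemma 6.1, the sketch below evaluates the standard closed form for a binary branching (linear birth-death) process in which each individual splits at rate `a` and dies at rate `b`. Reading the lemma's process this way is an assumption made for the example, and the function names are hypothetical. The formula requires `a != b`.

```python
import math

def pgf(s, t, a, b, k=1):
    """E[s**xi(t)] for a linear birth-death process started from k
    individuals (birth rate a, death rate b, a != b)."""
    e = math.exp(-(a - b) * t)
    num = b * (s - 1) - e * (a * s - b)
    den = a * (s - 1) - e * (a * s - b)
    return (num / den) ** k

def extinction_prob(t, a, b, k=1):
    """P(xi(t) = 0), obtained by evaluating the generating function at s = 0."""
    return pgf(0.0, t, a, b, k)
```

Two sanity checks are that `pgf(s, 0, a, b)` returns `s` and that, for `a > b` and `k = 1`, `extinction_prob(t, a, b)` approaches `b / a` at the exponential rate stated in part (a.1).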



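Proof. The formula for $G(s,t)$ comes from Chapter III.5 of Athreya & Ney (1972). From this formula, we deduce that

$$P(\xi^{(k)}(t) = 0) = G(0,t) = \left(\frac{b(1 - e^{-(a-b)t})}{a - be^{-(a-b)t}}\right)^k. \qquad (6.1)$$

For (a), we specialise to the case of $k = 1$ and $a > b$. We write $\xi = \xi^{(1)}$; then

$$\left|P(\xi(t) = 0) - \frac{b}{a}\right| = \frac{(a-b)be^{-(a-b)t}}{a(a - be^{-(a-b)t})} \le \frac{b}{a}e^{-(a-b)t}$$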
as required by (a.1). For $s < 1$, we have

$$P(1 \le \xi(t) \le K) \le s^{-K}\sum_{i=1}^{\infty} P(\xi(t) = i)s^i = s^{-K}\big(G(s,t) - G(0,t)\big) = \frac{(a-b)^2 s}{(a - be^{-(a-b)t})\big(a(e^{(a-b)t}s^K(1-s) + s^{K+1}) - bs^K\big)},$$

where $G(s,t) - G(0,t)$ can be computed from (6.1) using elementary algebra. The dominant term in the denominator of the above quantity is $e^{(a-b)t}s^K(1-s)$, which achieves the maximum

$$\frac{e^{(a-b)t}}{K+1}\left(1 - \frac{1}{K+1}\right)^K = \frac{e^{(a-b)t}}{K}\left(1 - \frac{1}{K+1}\right)^{K+1}$$

at $s = K/(K+1)$. For sufficiently large $K$, this is at least $e^{(a-b)t}/(3K)$. Therefore

$$P(1 \le \xi(t) \le K) \le \frac{(a-b)^2}{(a - be^{-(a-b)t})\big(ae^{(a-b)t}/(3K) - b\big)} \le C_{a,b}\left(\frac{ae^{(a-b)t}}{3K} - b\right)^{-1},$$

which implies the desired conclusion of (a.2), if $K \le e^{(a-b)t}/6$.
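
The algebra behind (a.2) can be checked numerically. The sketch below assumes the standard generating function of a linear birth-death process with birth rate `a` and death rate `b` (for a single initial individual) and compares $s^{-K}(G(s,t) - G(0,t))$ with the closed form used in the proof. It also evaluates the maximum of $s^K(1-s)$ against $1/(3K)$. The function names are hypothetical.

```python
import math

def pgf1(s, t, a, b):
    """Generating function E[s**xi(t)] for one initial individual (a != b)."""
    e = math.exp(-(a - b) * t)
    return (b * (s - 1) - e * (a * s - b)) / (a * (s - 1) - e * (a * s - b))

def closed_form(s, t, a, b, K):
    """Right-hand side for s**(-K) * (G(s,t) - G(0,t)) as used for (a.2)."""
    e = math.exp(-(a - b) * t)
    big = math.exp((a - b) * t)
    denom = (a - b * e) * (a * (big * s ** K * (1 - s) + s ** (K + 1)) - b * s ** K)
    return (a - b) ** 2 * s / denom

def max_term(K):
    """Maximum of s**K * (1 - s) over s in (0, 1), attained at s = K/(K+1)."""
    return (1.0 / (K + 1)) * (1.0 - 1.0 / (K + 1)) ** K
```

In this check the bound `max_term(K) >= 1 / (3 * K)` fails at `K = 4` and holds from `K = 5` onwards, which gives one concrete reading of "sufficiently large $K$".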

For (a.3), we observe that $M(t) = e^{-(a-b)t}\xi(t)$ is a martingale with maximum jump size 1 and quadratic variation $\langle M\rangle(t) = \int_0^t e^{-2(a-b)s}(a+b)\xi(s)\,ds$. Burkholder's inequality implies

$$E\Big[\sup_{s \le t} M(s)^2\Big] \le C + C\int_0^t e^{-2(a-b)s}(a+b)E[\xi(s)]\,ds = C + C\int_0^t e^{-2(a-b)s}(a+b)e^{(a-b)s}\,ds \le C_{a,b}.$$

Therefore $E[\sup_{s \le t}\xi(s)] \le C_{a,b}e^{(a-b)t}$, which implies (a.3).
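
The integral in the bound for (a.3) stays bounded uniformly in $t$ because the integrand decays like $e^{-(a-b)s}$ once the mean $E[\xi(s)] = e^{(a-b)s}$ is substituted. A small sketch of this, with hypothetical function names and assuming $a > b$, compares the closed form with a midpoint-rule approximation.

```python
import math

def qv_integral_closed(t, a, b):
    """Closed form of int_0^t exp(-2(a-b)s) (a+b) exp((a-b)s) ds, for a > b."""
    return (a + b) / (a - b) * (1.0 - math.exp(-(a - b) * t))

def qv_integral_midpoint(t, a, b, steps=20000):
    """Midpoint-rule approximation of the same integral."""
    h = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += math.exp(-2 * (a - b) * s) * (a + b) * math.exp((a - b) * s)
    return total * h
```

The closed form never exceeds $(a+b)/(a-b)$, which is the kind of constant depending only on $a$ and $b$ that the proof denotes by $C_{a,b}$.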
For (b), we observe that

$$P(\xi^{(k)}(t) = 0) = \left(1 - \frac{b-a}{be^{(b-a)t} - a}\right)^k.$$

For sufficiently large $t$, we have

$$be^{(b-a)t} - a \ge \frac{b}{1.1}e^{(b-a)t} \ge \frac{b-a}{1.1}e^{(b-a)t},$$

therefore

$$\left(1 - \frac{b-a}{be^{(b-a)t} - a}\right)^k \ge \big(1 - 1.1e^{-(b-a)t}\big)^k \ge e^{-1.2ke^{-(b-a)t}} \ge 1 - 1.2ke^{-(b-a)t}$$

if $t$ is sufficiently large and $ke^{-(b-a)t}$ is sufficiently small.
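
The subcritical bound in part (b) can be explored numerically. The sketch below assumes the extinction formula for a linear birth-death process with birth rate `a` and death rate `b > a`, started from `k` individuals. The function name and the parameter grid in the usage note are illustrative assumptions.

```python
import math

def survival_prob(t, a, b, k=1):
    """P(xi(t) > 0) for k initial individuals in the subcritical case a < b."""
    # Survival probability of a single lineage up to time t.
    p_one = (b - a) / (b * math.exp((b - a) * t) - a)
    # The k lineages evolve independently, so all must die for extinction.
    return 1.0 - (1.0 - p_one) ** k
```

For example, with `a = 1.0` and `b = 1.5`, the value of `survival_prob(t, a, b, k)` can be compared with `1.2 * k * math.exp(-(b - a) * t)` over a range of `t` and `k` to see how much room the constant 1.2 leaves.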

For (c), we observe that $\xi^{(k)} = \xi_1 + \xi_2 + \cdots + \xi_k$, where $\xi_1, \ldots, \xi_k$ are independent copies of $\xi^{(1)}$. Therefore

$$P(\xi^{(k)}(t) \in [1, K]) \le P(\xi_i(t) \in [1, K] \text{ for some } i = 1, \ldots, k) \le kC_{a,b}Ke^{-(a-b)t}$$

by part (a.2) of this lemma. Part (d) is a direct consequence of part (c). □
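
The union-bound step for part (c) can be illustrated with exact probabilities. The sketch below assumes a linear birth-death process with birth rate `a` and death rate `b`, for which the one-individual law at time `t` is a modified geometric distribution (a standard closed form, consistent with the generating function in Lemma 6.1). The law for `k` initial individuals is obtained by convolution. All function names are hypothetical.

```python
import math

def one_particle_law(t, a, b, n_max):
    """P(xi^(1)(t) = n) for n = 0..n_max (linear birth-death, a != b)."""
    big = math.exp((a - b) * t)
    alpha = b * (big - 1) / (a * big - b)   # extinction probability by time t
    beta = a * (big - 1) / (a * big - b)    # geometric ratio of the tail
    law = [alpha]
    for n in range(1, n_max + 1):
        law.append((1 - alpha) * (1 - beta) * beta ** (n - 1))
    return law

def convolve_truncated(p, q):
    """Convolution of two laws on {0, 1, ...}, kept exact on indices < len(p)."""
    return [sum(p[i] * q[m - i] for i in range(m + 1)) for m in range(len(p))]

def k_particle_law(t, a, b, k, n_max):
    """Law of the sum of k independent copies, truncated at n_max."""
    law = one_particle_law(t, a, b, n_max)
    total = law
    for _ in range(k - 1):
        total = convolve_truncated(total, law)
    return total
```

Summing entries `1..K` of `k_particle_law` and comparing with `k` times the same sum for `one_particle_law` shows the inequality used in the proof. The truncation is harmless here because only values up to `K` are needed and all summands are non-negative.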